## `Series` and `DataFrame`
In this section we will learn how to:
- create a `Series` from a `DataFrame` column
- create a `Series` from a `list`
- use `pl.Config` to change the length of strings printed out

We will look at the two objects for storing data in eager mode: the `Series` and the `DataFrame`.

In [None]:
import polars as pl

In [None]:
csvFile = "../data/titanic.csv"

In [None]:
df = pl.read_csv(csvFile)
df.head(3)

## Creating a `Series` from a `DataFrame` column

We can create a `Series` from a `DataFrame` column with square brackets. We can also call the `.head` method on a `Series` to restrict the number of rows output.

In [None]:
df["Name"].head(3)

Like a `DataFrame` column, a `Series` can have a name.

By default, `Polars` prints the first 14 characters of strings. We can adjust the number of characters with the `set_fmt_str_lengths` function in the `pl.Config` namespace.

In [None]:
pl.Config.set_fmt_str_lengths(100)
df["Name"].head(3)

## Create a `DataFrame` from a `Series`
We can convert a `Series` into a one-column `DataFrame` with `to_frame`.

In [None]:
s = df["Name"]
s.to_frame().head(3)

## Create a `Series` from a `list`
We can create a `Series` from a Python `list`.

In [None]:
values = [1,2,3]
pl.Series(values)

If the `name` argument is not set, it defaults to an empty string. The name can be passed as follows:

In [None]:
pl.Series(name='vals',values=values)

## Convert a `Series` to a `list`
We can convert a `Series` back to a Python `list` with `to_list`.

In [None]:
pl.Series(name='vals',values=values).to_list()

We can also convert a `Series` into a one-column `DataFrame` with `to_frame`.

In [None]:
pl.Series(name='vals',values=values).to_frame()

## Create a `DataFrame` from a `list`
We can create a `DataFrame` with:
- a list of `lists` holding the data, one inner list per column
- a list of string column names

In [None]:
(
    pl.DataFrame(
        [values],
        schema=["vals"])
    .head(2)
)

We can also pass a `dict` to the `schema` argument that maps each column name to a dtype. In this example we specify a 32-bit integer type for the `vals` column.

In [None]:
(
    pl.DataFrame(
        [values],
        schema={"vals":pl.Int32})
    .head(2)
)

In the exercises we see how to create a `DataFrame` from a `dict`.

In the section Selecting Columns and Transforming DataFrames we see how to add a column to a `DataFrame` from a list.

## Exercises
In the exercises you will develop your understanding of:
- extracting a `Series` from a `DataFrame`
- getting metadata from a `Series`
- creating a `Series` from a `list`
- creating a `DataFrame` from `lists`

### Exercise 1
Extract the `Age` column as a `Series` and then find:
- the `dtype` of the `Series`
- the median of the `Series`

In [None]:
df = pl.read_csv(csvFile)
s = <blank>

In [None]:
df = pl.read_csv(csvFile)
s = <blank>

### Exercise 2
You have the following Python `lists` with data.  

In [None]:
groups = ["a","a","b","b","c"]
values = [0,1,2,3,4]

Create a `Series` called `groupsSeries` from the `groups` list. The name inside the `Series` should be `groups`.

Create a `DataFrame` by passing both lists as a Python `dict` to `pl.DataFrame`.

## Solutions

### Solution to exercise 1
Extract the `Age` column as a `Series` and find:
- the `dtype` of the `Series`
- the median of the `Series`

In [None]:
df = pl.read_csv(csvFile)
s = df["Age"]
s.dtype

In [None]:
df = pl.read_csv(csvFile)
s = df["Age"]
s.median()

### Solution to exercise 2
You have the following Python `lists` with data.  

In [None]:
groups = ["a","a","b","b","c"]
values = [0,1,2,3,4]

Create a `Series` called `groupsSeries` from the `groups` list. The name inside the `Series` should be `groups`.

In [None]:
groupsSeries = pl.Series("groups",groups)

Create a `DataFrame` by passing both lists as a Python `dict` to `pl.DataFrame`.

In [None]:
pl.DataFrame(
    {
        "groups":groups,
        "vals":values
    }
)