# 1. What kind of data does pandas handle?

In [1]:
import pandas as pd

*I want to store passenger data of the Titanic. For a number of passengers, I know the name (characters), age (integers) and sex (male/female).*

In [2]:
df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

df

|   | Name | Age | Sex |
|---|------|-----|-----|
| 0 | Braund, Mr. Owen Harris | 22 | male |
| 1 | Allen, Mr. William Henry | 35 | male |
| 2 | Bonnell, Miss. Elizabeth | 58 | female |


To manually store data in a table, create a `DataFrame`. When it is built from a Python dictionary of lists, the dictionary keys become the column headers and each list supplies the values of the corresponding column.

A `DataFrame` is a 2-dimensional data structure that can store data of different types (including characters, integers, floating point values, categorical data and more) in columns.

The table has 3 columns, each of them with a column label. The column labels are respectively `Name`, `Age` and `Sex`.
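As a quick check of the structure described above, the following sketch rebuilds the same table and inspects its column labels, shape and per-column data types. The variable names `column_labels`, `n_rows`, `n_cols` and `column_dtypes` are only illustrative.

```python
import pandas as pd

# Same passenger table as in the tutorial
df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

column_labels = list(df.columns)  # dictionary keys, in insertion order
n_rows, n_cols = df.shape         # (number of rows, number of columns)
column_dtypes = df.dtypes         # one data type per column

print(column_labels)  # ['Name', 'Age', 'Sex']
print(n_rows, n_cols)  # 3 3
print(column_dtypes)
```

The `Age` column is reported with an integer dtype. The exact dtype shown for the text columns `Name` and `Sex` may differ between pandas versions.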

Each column in a `DataFrame` is a `Series`.

*I’m just interested in working with the data in the column `Age`.*

In [3]:
df["Age"]

0    22
1    35
2    58
Name: Age, dtype: int64

When selecting a single column of a pandas `DataFrame`, the result is a pandas `Series`. 
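To make the distinction concrete, here is a minimal sketch comparing selection with a single label against selection with a list of labels. It assumes the same three-passenger table (only the `Age` and `Sex` columns are rebuilt here), and the names `age_series` and `age_frame` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

age_series = df["Age"]   # single label: returns a Series
age_frame = df[["Age"]]  # list of labels: returns a one-column DataFrame

print(type(age_series).__name__)  # Series
print(type(age_frame).__name__)   # DataFrame
print(age_series.name)            # Age
print(age_frame.shape)            # (3, 1)
```

The `Series` keeps the column label as its `name`, while the one-column `DataFrame` keeps it as a column header.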

You can create a `Series` from scratch as well:

In [4]:
ages = pd.Series([22, 35, 58], name="Age")
ages

0    22
1    35
2    58
Name: Age, dtype: int64

A pandas `Series` has no column labels, as it is just a single column of a `DataFrame`. A `Series` does have row labels, shown as `0`, `1` and `2` on the left of the output above.
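The row labels can be inspected through the `index` attribute. The sketch below also builds a second `Series` with custom row labels; the surnames used as labels are an illustrative choice, not something the tutorial itself does.

```python
import pandas as pd

ages = pd.Series([22, 35, 58], name="Age")

# Default row labels: integer positions starting at 0
print(ages.index)  # RangeIndex(start=0, stop=3, step=1)

# Hypothetical custom row labels, chosen here only for illustration
labelled_ages = pd.Series(
    [22, 35, 58],
    index=["Braund", "Allen", "Bonnell"],
    name="Age",
)

# Values can then be looked up by row label
print(labelled_ages["Allen"])  # 35
```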

*I want to know the maximum Age of the passengers.*

In [5]:
df["Age"].max()

58

In [6]:
# Or, equivalently, on the Series created above
ages.max()

58
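Other reductions follow the same pattern as `max()`. As a small sketch on the same three ages, `min()` and `mean()` are called in the same way, and the column and the standalone `Series` give identical results.

```python
import pandas as pd

df = pd.DataFrame({"Age": [22, 35, 58]})
ages = pd.Series([22, 35, 58], name="Age")

# The column of the DataFrame and the standalone Series agree
oldest_from_column = df["Age"].max()
oldest_from_series = ages.max()
print(oldest_from_column == oldest_from_series)  # True

# Other reductions are called the same way
youngest = ages.min()
average_age = ages.mean()
print(youngest)  # 22
print(round(average_age, 6))  # 38.333333
```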

*I’m interested in some basic statistics of the numerical data of my data table.*

In [7]:
df.describe()

|       | Age       |
|-------|-----------|
| count | 3.0       |
| mean  | 38.333333 |
| std   | 18.230012 |
| min   | 22.0      |
| 25%   | 28.5      |
| 50%   | 35.0      |
| 75%   | 46.5      |
| max   | 58.0      |


The `describe()` method provides a quick overview of the numerical data in a `DataFrame`. Because the `Name` and `Sex` columns hold text, `describe()` leaves them out by default.
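If a summary of the text columns is also wanted, one option is the `include` parameter of `describe()`, or calling `describe()` on a single text column. The sketch below assumes the same three-passenger table; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

numeric_summary = df.describe()           # numerical columns only (default)
full_summary = df.describe(include="all") # every column, NaN where a statistic does not apply
sex_summary = df["Sex"].describe()        # count, unique, top, freq for a text column

print(list(numeric_summary.columns))  # ['Age']
print(list(full_summary.columns))     # ['Name', 'Age', 'Sex']
print(sex_summary["top"], sex_summary["freq"])  # male 2
```

For a text column, the summary reports the number of values, the number of distinct values, the most frequent value and how often it occurs, rather than the mean and quantiles.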