# What kind of data does pandas handle?

In [4]:
import pandas as pd

In [5]:
pd.__version__

'1.1.5'

## pandas data table representation

### DataFrame

To manually store data in a table, create a `DataFrame`.
When you pass a Python dictionary of lists, the dictionary keys become the column headers and each list supplies the values of one column of the `DataFrame`.

In [6]:
df = pd.DataFrame(
    {
        "Name": [
            "Rossi, Paolo",
            "Bianchi, Maria",
            "Verdi, Luigi",
        ],
        "Age": [22, 30, 51],
        "Sex": ["M", "F", "M"],
    }
)

In [7]:
df

             Name  Age Sex
0    Rossi, Paolo   22   M
1  Bianchi, Maria   30   F
2    Verdi, Luigi   51   M


A `DataFrame` is a 2-dimensional data structure that can store data of different types in columns. It is similar to a spreadsheet, a SQL table or a `data.frame` in R.

In this case:

- the table has 3 columns, each with a column label. The column labels are `Name`, `Age` and `Sex`, respectively.
- the column `Name` holds textual data, with each value a string; the column `Age` holds numbers; and the column `Sex` holds textual data.
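The points above can be checked directly on the table. The following is a minimal sketch that rebuilds the same `df` and inspects its shape, column labels and column types; the helper variable names (`shape`, `labels`, and so on) are only illustrative, and the exact dtype shown for the text columns may differ between pandas versions.

```python
import pandas as pd

# Rebuild the example table so this snippet runs on its own.
df = pd.DataFrame(
    {
        "Name": ["Rossi, Paolo", "Bianchi, Maria", "Verdi, Luigi"],
        "Age": [22, 30, 51],
        "Sex": ["M", "F", "M"],
    }
)

shape = df.shape            # (number of rows, number of columns)
labels = list(df.columns)   # column labels, in order

print(shape)
print(labels)
print(df.dtypes)            # one dtype per column

# Age is numeric; Name is not.
age_is_numeric = pd.api.types.is_numeric_dtype(df["Age"])
name_is_numeric = pd.api.types.is_numeric_dtype(df["Name"])
print(age_is_numeric, name_is_numeric)
```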

### Each column in a `DataFrame` is a `Series`

In [8]:
df["Age"]

0    22
1    30
2    51
Name: Age, dtype: int64
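To see that a single column really is a `Series`, you can inspect the selected object. This sketch rebuilds the example table and also shows, as a contrast, that selecting with a list of labels (double brackets) gives a one-column `DataFrame` instead; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Rossi, Paolo", "Bianchi, Maria", "Verdi, Luigi"],
        "Age": [22, 30, 51],
        "Sex": ["M", "F", "M"],
    }
)

age_column = df["Age"]      # single label: a Series
age_table = df[["Age"]]     # list of labels: a one-column DataFrame

print(type(age_column).__name__)   # Series
print(type(age_table).__name__)    # DataFrame

# The Series keeps the column label as its name and shares the row labels.
print(age_column.name)
print(age_column.index.equals(df.index))
```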

You can also create a `Series` from scratch:

In [10]:
ages = pd.Series([25, 43, 19], name="Ages")

In [11]:
ages

0    25
1    43
2    19
Name: Ages, dtype: int64
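A `Series` has row labels but no column labels. As a small sketch of how the two structures relate, the snippet below recreates `ages` and converts it into a one-column `DataFrame` with `to_frame()`, where the name of the `Series` becomes the column label. The variable `ages_table` is just an illustrative name.

```python
import pandas as pd

ages = pd.Series([25, 43, 19], name="Ages")

# Row labels default to 0, 1, 2, ... when none are given.
row_labels = list(ages.index)
print(row_labels)

# Converting to a DataFrame turns the Series name into the column label.
ages_table = ages.to_frame()
print(list(ages_table.columns))
print(ages_table.shape)
```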

### Do something with a DataFrame or Series

I want to know the maximum age of the people in the table.

In [12]:
df["Age"].max()

51
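The same method works on any `Series`, and a related method, `idxmax()`, returns the row label of the maximum so you can look up the rest of that row. The following sketch assumes the example `df` and `ages` from earlier; `oldest_label` and `oldest_name` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Rossi, Paolo", "Bianchi, Maria", "Verdi, Luigi"],
        "Age": [22, 30, 51],
        "Sex": ["M", "F", "M"],
    }
)
ages = pd.Series([25, 43, 19], name="Ages")

max_age = df["Age"].max()     # maximum of a DataFrame column
max_ages = ages.max()         # maximum of a standalone Series

# idxmax() gives the row label where the maximum occurs.
oldest_label = df["Age"].idxmax()
oldest_name = df.loc[oldest_label, "Name"]

print(max_age, max_ages)
print(oldest_label, oldest_name)
```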

I'm interested in some basic statistics of the numerical data in my data table.

In [13]:
df.describe()

             Age
count        3.0
mean   34.333333
std    14.977761
min         22.0
25%         26.0
50%         30.0
75%         40.5
max         51.0


The `describe()` method provides a quick overview of the numerical data in a `DataFrame`. Because the `Name` and `Sex` columns hold textual data, `describe()` leaves them out by default.
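If you do want a summary of the textual columns, one option is to call `describe()` on just those columns, which reports counts and the most frequent value instead of means and quantiles. Passing `include="all"` summarises every column at once. This is a minimal sketch on the example table; `text_summary` and `full_summary` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Rossi, Paolo", "Bianchi, Maria", "Verdi, Luigi"],
        "Age": [22, 30, 51],
        "Sex": ["M", "F", "M"],
    }
)

# Describing only textual columns yields count, unique, top and freq.
text_summary = df[["Name", "Sex"]].describe()
print(text_summary)

# include="all" covers numeric and textual columns together;
# statistics that do not apply to a column are shown as NaN.
full_summary = df.describe(include="all")
print(full_summary)
```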

Many pandas operations return a `DataFrame` or a `Series`. The `describe()` method is an example: called on a `DataFrame`, as above, it returns a `DataFrame`; called on a single `Series`, it returns a `Series`.
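A quick way to confirm what an operation returns is to check the type of its result. This sketch calls `describe()` on the whole example table and on one column; `table_stats` and `age_stats` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Rossi, Paolo", "Bianchi, Maria", "Verdi, Luigi"],
        "Age": [22, 30, 51],
        "Sex": ["M", "F", "M"],
    }
)

table_stats = df.describe()         # called on a DataFrame
age_stats = df["Age"].describe()    # called on a Series

print(type(table_stats).__name__)   # DataFrame
print(type(age_stats).__name__)     # Series

# Individual statistics can be read by label from the Series result.
print(age_stats["max"])
```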