# Introduction to working with DataFrames
In basic Python, we often store measurements in dictionaries whose values are lists. While these basic structures are handy for collecting data, they are suboptimal for further data processing. For that, we introduce [pandas DataFrames](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), which are more convenient in the next steps. In Python, scientists often call tables "DataFrames".

In [1]:
import pandas as pd

## Creating DataFrames from a dictionary of lists
Assume we did some image processing and have the results available in a dictionary that contains lists of numbers:

In [2]:
measurements = {
    "labels":      [1, 2, 3],
    "area":       [45, 23, 68],
    "minor_axis": [2, 4, 4],
    "major_axis": [3, 4, 5],
}

This data structure can be nicely visualized using a DataFrame:

In [3]:
df = pd.DataFrame(measurements)
df

Unnamed: 0,labels,area,minor_axis,major_axis
0,1,45,2,3
1,2,23,4,4
2,3,68,4,5


Using these DataFrames, data modification is straightforward. For example, one can append a new column and compute its values from existing columns:

In [4]:
df["aspect_ratio"] = df["major_axis"] / df["minor_axis"]
df

Unnamed: 0,labels,area,minor_axis,major_axis,aspect_ratio
0,1,45,2,3,1.5
1,2,23,4,4,1.0
2,3,68,4,5,1.25
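The column assignment above changes `df` in place. As a minimal sketch of an alternative, `DataFrame.assign` returns a new table with the extra column and leaves the original untouched. The variable name `with_ratio` is chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "labels":     [1, 2, 3],
    "area":       [45, 23, 68],
    "minor_axis": [2, 4, 4],
    "major_axis": [3, 4, 5],
})

# assign() returns a copy with the new column; df itself is not modified
with_ratio = df.assign(aspect_ratio=df["major_axis"] / df["minor_axis"])

with_ratio
```

This can be handy when the original measurements should stay unchanged for later steps.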


We can also save this table to disk so that we can continue working with it later.

In [5]:
df.to_csv("../../data/short_table.csv")
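To continue working with a saved table, it can be loaded again using `pd.read_csv`. The following is a minimal sketch, assuming a temporary folder instead of the data folder used above so that no existing files are touched. By default, `to_csv` also writes the row index as a first column without a name. Passing `index_col=0` when reading turns that column back into the index; without it, pandas shows it as a column called `Unnamed: 0`.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({
    "labels": [1, 2, 3],
    "area":   [45, 23, 68],
})

# write to a temporary folder (an assumption made for this sketch)
with tempfile.TemporaryDirectory() as folder:
    csv_path = Path(folder) / "short_table.csv"
    df.to_csv(csv_path)

    # use the first column of the file as the row index again
    reloaded = pd.read_csv(csv_path, index_col=0)

    # without index_col, the saved index appears as an extra column
    with_extra_column = pd.read_csv(csv_path)

reloaded
```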

## Creating DataFrames from lists of lists
Sometimes, we are confronted with data in the form of lists of lists. To make pandas interpret such data correctly, we also need to provide the headers in the same order as the lists.

In [6]:
header = ['labels', 'area', 'minor_axis', 'major_axis']

data = [
    [1, 2, 3],
    [45, 23, 68],
    [2, 4, 4],
    [3, 4, 5],
]
          
# convert the data and header lists into a pandas DataFrame;
# the headers are used as row labels (index) here
data_frame = pd.DataFrame(data, index=header)

# show it
data_frame

Unnamed: 0,0,1,2
labels,1,2,3
area,45,23,68
minor_axis,2,4,4
major_axis,3,4,5


As you can see, this table is _rotated_: each list became a row instead of a column. We can bring it into the usual form like this:

In [7]:
# rotate/flip it
data_frame = data_frame.transpose()

# show it
data_frame

Unnamed: 0,labels,area,minor_axis,major_axis
0,1,45,2,3
1,2,23,4,4
2,3,68,4,5
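As an alternative to transposing, the headers and the lists can be paired up first, which produces the same kind of dictionary of lists used at the beginning. This is a minimal sketch; the variable name `data_frame2` is chosen here for illustration only.

```python
import pandas as pd

header = ['labels', 'area', 'minor_axis', 'major_axis']

data = [
    [1, 2, 3],
    [45, 23, 68],
    [2, 4, 4],
    [3, 4, 5],
]

# pair each header with its list, giving a dictionary of lists
measurements = dict(zip(header, data))

# each dictionary entry becomes one column
data_frame2 = pd.DataFrame(measurements)

data_frame2
```

This relies on the headers and the lists being in the same order, as `zip` matches them by position.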
