## Data Structures in Pandas

Pandas has two main data structures:
- DataFrame, which is two-dimensional
- Series, which is one-dimensional
![alt text](../img/01-PandasData.png)

### What is a Pandas DataFrame

- A two-dimensional, tabular data structure
- Each row is identified by a row label; together the row labels form the
  index, and they may be numbers or strings
- Each column is identified by a column label, which may also be a number or a
  string
- The following DataFrame contains 10 rows (labeled 0-9) and 5 columns (name,
  calories, protein, vitamins, rating)
  ![alt text](../img/05-PandasDataFrame.png)
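
To make the row and column labels concrete, here is a minimal sketch that builds a small DataFrame and inspects its labels. The values and the string row labels are made up for illustration; they are not the data shown in the figure.

```python
import pandas as pd

# Hypothetical values for illustration only (not the data from the figure)
df = pd.DataFrame(
    {
        "name": ["Cereal A", "Cereal B", "Cereal C"],
        "calories": [70, 120, 110],
        "rating": [68.4, 34.0, 59.4],
    },
    index=["a", "b", "c"],  # string row labels; integers from 0 are the default
)

print(df.shape)          # (number of rows, number of columns)
print(list(df.index))    # row labels
print(list(df.columns))  # column labels
```

Leaving out the `index` argument would give the default numerical row labels instead.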

### What is a Pandas Series

- A one-dimensional, labeled data structure
- It holds a single row or a single column of values
- The following Series contains 10 rows (labeled 0-9) and a single column called
  calories
![alt text](../img/05-PandasSeries.png)
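
As a rough sketch of the same idea in code, a Series can be built directly from a list. The calorie values below are invented for illustration and do not come from the figure.

```python
import pandas as pd

# Hypothetical calorie values for illustration only
calories = pd.Series([70, 120, 110], name="calories")

print(calories.ndim)         # number of dimensions of a Series
print(calories.name)         # the label that serves as its column name
print(list(calories.index))  # default numerical row labels
```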

### DataFrame vs Series

- A Pandas DataFrame can be viewed as a collection of one or more Series that
  share the same index
- The Series in the previous example was extracted from the DataFrame
![alt text](../img/05-PandasDataframeVsSeries.png)
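
The relationship can be checked with a short sketch: selecting a single column or a single row of a DataFrame returns a Series. The data here is hypothetical and only stands in for the table in the figure.

```python
import pandas as pd

# Hypothetical values for illustration only
df = pd.DataFrame(
    {
        "name": ["Cereal A", "Cereal B", "Cereal C"],
        "calories": [70, 120, 110],
    }
)

col = df["calories"]  # one column, returned as a Series
row = df.loc[0]       # one row (selected by its label), also a Series

print(type(col).__name__, type(row).__name__)
print(col.index.equals(df.index))  # the column keeps the DataFrame's row labels
print(list(row.index))             # the row is labeled by the column labels
```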

## Creating a DataFrame using Lists

- We can create a DataFrame from a list of lists, where each inner list becomes
  one row
- We pass the list as an argument to the `pandas.DataFrame()` constructor, which
  returns a DataFrame
- Pandas automatically assigns numerical row labels (starting at 0) to the rows
  of the DataFrame
- If we don't provide column labels, Pandas automatically assigns numerical
  column labels to the columns as well

```python
import pandas as pd

# Each inner list becomes one row of the DataFrame
myList = [
    ['Apple', 'Red'],
    ['Banana', 'Yellow'],
    ['Orange', 'Orange']
]

myDataFrame = pd.DataFrame(myList)

myDataFrame
```

|   | 0      | 1      |
|---|--------|--------|
| 0 | Apple  | Red    |
| 1 | Banana | Yellow |
| 2 | Orange | Orange |


```python
# With custom column labels
myDataFrame2 = pd.DataFrame(myList, columns=['Fruit', 'Color'])

myDataFrame2
```

|   | Fruit  | Color  |
|---|--------|--------|
| 0 | Apple  | Red    |
| 1 | Banana | Yellow |
| 2 | Orange | Orange |


Because a NumPy array behaves much like a Python list with added functionality,
we can also convert a two-dimensional NumPy array to a Pandas DataFrame in the
same way.

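```python
import numpy as np
import pandas as pd

# Each row of the 2D array becomes one row of the DataFrame
npArr = np.array([
    [0, 1],
    [2, 3],
    [4, 5]
])

myDataFrame3 = pd.DataFrame(npArr, columns=['Even', 'Odd'])

myDataFrame3
```

|   | Even | Odd |
|---|------|-----|
| 0 | 0    | 1   |
| 1 | 2    | 3   |
| 2 | 4    | 5   |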
