# Session 2. Pandas Data Structures

## The previous section introduced the Pandas DataFrame and Series objects. These data structures resemble the primitive Python data containers (lists and dictionaries) for indexing and labeling, but have additional features that make working with data easier.


## This session will cover:
1. Creating data by hand
2. The Series object
3. Basic operations on Series objects
4. The DataFrame object
5. Conditional subsetting and fancy slicing and indexing
6. Saving out data



# 0. Let's load some libraries

In [40]:
import pandas as pd

# 1. Let's create our own data

## 1.1. Creating a Series

### The Pandas Series is a one-dimensional, labeled container, similar to the built-in Python list.
### Unlike a list, a Series has a single dtype shared by all of its elements; values of mixed types are stored with the generic object dtype.


In [41]:
mySeries=pd.Series(['banana', 'apple', 'coconut', 'orange'])

In [42]:
mySeries

0     banana
1      apple
2    coconut
3     orange
dtype: object
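
The integers printed to the left of the values are the Series index, which Pandas generates automatically when none is given. As a minimal sketch of how labeling works, the example below passes an explicit `index`; the prices are made-up values used only for illustration.

```python
import pandas as pd

# Hypothetical prices, labeled by fruit name instead of 0, 1, 2.
fruit_prices = pd.Series([0.25, 0.40, 1.10],
                         index=['banana', 'apple', 'coconut'])

# Label-based lookup, much like a dictionary key.
print(fruit_prices['apple'])

# Position-based lookup, much like a list index.
print(fruit_prices.iloc[0])
```

Using `.iloc` for positions keeps the two kinds of lookup unambiguous when the labels are not integers.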

In [43]:
otherSeries=pd.Series(['4.5','3.5','2.3','0.5','6.7'])

In [44]:
otherSeries

0    4.5
1    3.5
2    2.3
3    0.5
4    6.7
dtype: object
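
The values in `otherSeries` are quoted, so Pandas stores them as text rather than numbers, and arithmetic on them would not behave numerically. The sketch below shows one way to convert such a Series with `astype(float)`; the variable names are illustrative and not part of the session's notebook.

```python
import pandas as pd

# The same values as otherSeries, stored as strings.
as_text = pd.Series(['4.5', '3.5', '2.3', '0.5', '6.7'])

# Convert every element to a floating-point number.
as_numbers = as_text.astype(float)

print(as_numbers.dtype)   # float64
print(as_numbers.mean())  # approximately 3.5
```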

## 1.2. Creating a DataFrame from a dictionary

### The DataFrame is the most common Pandas object. It can be thought of as Python’s way of storing spreadsheet-like data.

### A DataFrame can be thought of as a dictionary of Series objects. This is why dictionaries are the most common way of creating a DataFrame by hand.
### The key represents the column name, and the values are the contents of the column.



In [55]:
dictionaryOfScientists={'Name': ['Rosaline Franklin', 'William Gosset','Alexander Flemming'],
                        'Occupation': ['Chemist', 'Statistician','MDoctor'],
                        'Born': ['1920-07-25', '1876-06-13','1881-08-06'],
                        'Died': ['1958-04-16', '1937-10-16','1954-03-11'],
                        'Age': [37, 61,73]}



In [56]:
dictionaryOfScientists

{'Name': ['Rosaline Franklin', 'William Gosset', 'Alexander Flemming'],
 'Occupation': ['Chemist', 'Statistician', 'MDoctor'],
 'Born': ['1920-07-25', '1876-06-13', '1881-08-06'],
 'Died': ['1958-04-16', '1937-10-16', '1954-03-11'],
 'Age': [37, 61, 73]}

In [57]:
DataFrameOfScientists = pd.DataFrame(dictionaryOfScientists)

In [58]:
DataFrameOfScientists

                 Name    Occupation        Born        Died  Age
0   Rosaline Franklin       Chemist  1920-07-25  1958-04-16   37
1      William Gosset  Statistician  1876-06-13  1937-10-16   61
2  Alexander Flemming       MDoctor  1881-08-06  1954-03-11   73
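
To make the "dictionary of Series" idea concrete, the sketch below builds a smaller version of the same table and pulls one column back out. It also passes the optional `columns` argument to set the column order explicitly; the variable names here are illustrative.

```python
import pandas as pd

# A reduced version of the scientists data, with an explicit column order.
scientists = pd.DataFrame(
    {'Name': ['Rosaline Franklin', 'William Gosset'],
     'Occupation': ['Chemist', 'Statistician'],
     'Age': [37, 61]},
    columns=['Occupation', 'Name', 'Age'],
)

# Each column of the DataFrame is itself a Series.
age = scientists['Age']
print(type(age).__name__)        # Series
print(list(scientists.columns))  # ['Occupation', 'Name', 'Age']
```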


## 1.3. Creating a DataFrame from records

### We can also create a DataFrame from existing records, such as a list of tuples.

In [49]:
stocksRecords = [(3000, 'Google'), (213, 'Microsoft'), (1250, 'Apple'), (4500, 'Facebook'),(100, 'Tesla')]

In [50]:
stocksRecords

[(3000, 'Google'),
 (213, 'Microsoft'),
 (1250, 'Apple'),
 (4500, 'Facebook'),
 (100, 'Tesla')]

In [51]:
stocksDataFrame=pd.DataFrame.from_records(stocksRecords, columns=['Price', 'Company'])

In [52]:
stocksDataFrame

   Price    Company
0   3000     Google
1    213  Microsoft
2   1250      Apple
3   4500   Facebook
4    100      Tesla
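
Records do not have to be tuples. As a rough sketch, `from_records` also accepts a list of dictionaries, in which case the column names come from the keys and no `columns` argument is needed. The example reuses two of the rows above; the variable names are illustrative.

```python
import pandas as pd

# Records as tuples: column names must be supplied separately.
tuple_records = [(3000, 'Google'), (213, 'Microsoft')]
from_tuples = pd.DataFrame.from_records(tuple_records,
                                        columns=['Price', 'Company'])

# Records as dictionaries: column names are taken from the keys.
dict_records = [{'Price': 3000, 'Company': 'Google'},
                {'Price': 213, 'Company': 'Microsoft'}]
from_dicts = pd.DataFrame.from_records(dict_records)

print(from_tuples)
print(from_dicts)
```

Both calls should produce the same two-row table.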


# 2. Let's select some data

## 2.1. Selection based on column name

In [54]:
DataFrameOfScientists['Name']

0     Rosaline Franklin
1        William Gosset
2    Alexander Flemming
Name: Name, dtype: object

In [60]:
DataFrameOfScientists[['Name','Occupation']]

                 Name    Occupation
0   Rosaline Franklin       Chemist
1      William Gosset  Statistician
2  Alexander Flemming       MDoctor
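
The two selections above return different kinds of object: a single column name gives back a Series, while a list of column names gives back a DataFrame, even when the list holds only one name. A minimal sketch, using a reduced copy of the scientists data with illustrative variable names:

```python
import pandas as pd

scientists = pd.DataFrame({
    'Name': ['Rosaline Franklin', 'William Gosset'],
    'Occupation': ['Chemist', 'Statistician'],
})

one_column = scientists['Name']     # single name: a Series
sub_frame = scientists[['Name']]    # list of names: a DataFrame

print(type(one_column).__name__)    # Series
print(type(sub_frame).__name__)     # DataFrame
print(sub_frame.shape)              # (2, 1)
```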
