- While numpy ndarrays are homogeneous (every element shares one dtype), pandas relaxes this requirement and allows different dtypes in its data structures, for example a different dtype for each DataFrame column.
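
A minimal sketch of this difference, using made-up values. The names `arr` and `df` and their contents are illustrative assumptions, not part of the original material.

```python
import numpy as np
import pandas as pd

# A numpy array has exactly one dtype; mixing ints and strings
# coerces every element to a string.
arr = np.array([1, 2, 'three'])
print(arr.dtype.kind)  # 'U' (unicode string)

# A DataFrame stores each column with its own dtype.
df = pd.DataFrame({'count': [1, 2, 3], 'label': ['a', 'b', 'c']})
print(df.dtypes)
```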

---
### Series
- The Series is one of the building blocks of pandas. A Series is a one-dimensional labeled array that can hold data of any type (integers, strings, floats, Python objects, etc.), similar to a column in an Excel spreadsheet. The axis labels are collectively called the index.

In [1]:
import pandas as pd

print(pd.Series([1, 2, 3], index=['a', 'b', 'c'])) # with index

a    1
b    2
c    3
dtype: int64


In [2]:
import numpy as np
import pandas as pd

print(pd.Series(np.array([1, 2, 3]), index=['a', 'b', 'c'])) # from a 1D numpy array

a    1
b    2
c    3
dtype: int64


In [3]:
import pandas as pd

print(pd.Series({'a': 1, 'b': 2, 'c':3})) # from a dict

a    1
b    2
c    3
dtype: int64


In [4]:
import pandas as pd

series = pd.Series({'a': 1, 'b': 2, 'c':3})
print(series['a'])

1
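
As a small sketch of how the index relates to the values, the labels can be read from the `.index` attribute, and the same element can be reached by label or by integer position. Note that `.iloc` is a standard pandas accessor that the text above does not cover; it is included here only for comparison.

```python
import pandas as pd

series = pd.Series({'a': 1, 'b': 2, 'c': 3})

# The axis labels live in the .index attribute.
print(list(series.index))  # ['a', 'b', 'c']

# Access by label and by integer position return the same element.
print(series['a'])     # 1
print(series.iloc[0])  # 1
```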


### DataFrames 
---
- In data science, data usually has more than one dimension and mixes several data types, so a Series alone is not sufficient. A DataFrame is a two-dimensional structure with both row and column labels. One way to create a DataFrame from scratch is to pass in a dict.

In [5]:
import pandas as pd

wine_dict = {
    'red_wine': [3, 6, 5],
    'white_wine':[5, 0, 10]
}
sales = pd.DataFrame(wine_dict, index=["adam", "bob", "charles"])
print(sales)

         red_wine  white_wine
adam            3           5
bob             6           0
charles         5          10


In [6]:
import pandas as pd

wine_dict = {
    'red_wine': [3, 6, 5],
    'white_wine':[5, 0, 10]
}
sales = pd.DataFrame(wine_dict, index=["adam", "bob", "charles"])
print(sales['white_wine'])

adam        5
bob         0
charles    10
Name: white_wine, dtype: int64


- If we don’t supply an index, the DataFrame generates an integer index starting from 0.
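
A short sketch of the default index, reusing the wine data from the cells above. The variable name `sales_default` is an illustrative assumption.

```python
import pandas as pd

wine_dict = {
    'red_wine': [3, 6, 5],
    'white_wine': [5, 0, 10]
}

# No index argument is passed, so pandas assigns the integer labels 0, 1, 2.
sales_default = pd.DataFrame(wine_dict)
print(sales_default)
print(list(sales_default.index))  # [0, 1, 2]
```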
----
----
##### Inspect a DataFrame - shape and size
- Let’s take a look at a new DataFrame: in addition to the heights and ages of the presidents, it contains their order, names, and parties. The DataFrame presidents_df is read from a CSV file as follows. Note that the index is set to the names of the presidents.

In [7]:
import pandas as pd

presidents_df = pd.read_csv('https://sololearn.com/uploads/files/president_heights_party.csv', index_col='name')
                                  
print(presidents_df.shape)

(45, 4)


In [8]:
                                  
print(presidents_df.shape[0])

45


In [9]:
                                  
print(presidents_df.size)

180


- Here both attributes, .shape and .size, work the same way as with numpy ndarrays: .shape gives (rows, columns) and .size gives the total number of elements.
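
The relationship between the two can be sketched on a small frame. The 3-by-2 frame below is an illustrative stand-in, not the presidents data; the same arithmetic gives 45 * 4 = 180 for presidents_df.

```python
import pandas as pd

# Small illustrative frame with 3 rows and 2 columns.
df = pd.DataFrame({'red_wine': [3, 6, 5], 'white_wine': [5, 0, 10]})

n_rows, n_cols = df.shape
print(df.shape)  # (3, 2)
print(df.size)   # 6

# size is the total number of cells: rows times columns.
print(df.size == n_rows * n_cols)  # True
```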
-----
----
#### Rows with .loc 
- Instead of memorizing integer positions to look up the order, age, height, and party of Abraham Lincoln, we can access his row in the DataFrame by its index label (his name) using .loc.


In [10]:
print(presidents_df.loc['Abraham Lincoln'])

order             16
age               52
height           193
party     republican
Name: Abraham Lincoln, dtype: object


In [11]:
print(type(presidents_df.loc['Abraham Lincoln']))
print(presidents_df.loc['Abraham Lincoln'].shape)

<class 'pandas.core.series.Series'>
(4,)


In [12]:
print(presidents_df.loc['Abraham Lincoln':'Ulysses S. Grant'])

                  order  age  height           party
name                                                
Abraham Lincoln      16   52     193      republican
Andrew Johnson       17   56     178  national union
Ulysses S. Grant     18   46     173      republican


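- .loc[] allows us to select data by label or by a boolean condition. Note that a label slice such as 'Abraham Lincoln':'Ulysses S. Grant' includes both endpoints, unlike positional slicing.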
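
Selection by condition can be sketched as follows. To keep the example self-contained, it rebuilds only the three rows shown in the output above into a hypothetical `sample_df`; the thresholds and variable names are illustrative assumptions.

```python
import pandas as pd

# Three rows copied from the output above; the real presidents_df has 45 rows.
sample_df = pd.DataFrame(
    {
        'order': [16, 17, 18],
        'age': [52, 56, 46],
        'height': [193, 178, 173],
        'party': ['republican', 'national union', 'republican'],
    },
    index=pd.Index(
        ['Abraham Lincoln', 'Andrew Johnson', 'Ulysses S. Grant'], name='name'
    ),
)

# Boolean condition: keep only the rows where party equals 'republican'.
republicans = sample_df.loc[sample_df['party'] == 'republican']
print(republicans)

# A condition can be combined with a column label to pick one column.
tall_ages = sample_df.loc[sample_df['height'] > 175, 'age']
print(tall_ages)
```

The comparison inside the brackets produces a boolean Series aligned with the index, and .loc keeps the rows where it is True.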