# Pandas_Starter

# `Series` objects
The `pandas` library contains these useful data structures:
* `Series` objects, which we discuss first. A `Series` object is a 1D labeled array, similar to a column in a spreadsheet (with a column name and row labels).
* `DataFrame` objects. This is a 2D table, similar to a spreadsheet (with column names and row labels).
* `Panel` objects. You can think of a `Panel` as a dictionary of `DataFrame`s. These were rarely used, and `Panel` has since been deprecated and removed from pandas.

## 1. Object Creation : Series

### A. From a list

In [6]:
import numpy as np
import pandas as pd

data1 = pd.Series([2, 4, 6, 7, 8, np.nan])

print(data1)

0    2.0
1    4.0
2    6.0
3    7.0
4    8.0
5    NaN
dtype: float64


In [7]:
data1.values                     # print series values           

array([ 2.,  4.,  6.,  7.,  8., nan])

In [8]:
data1.index                      # print series index                               

RangeIndex(start=0, stop=6, step=1)

In [9]:
data1[3]                         # indexing

7.0

In [10]:
data1[1:4]                       # slicing

1    4.0
2    6.0
3    7.0
dtype: float64
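
The bracket syntax above works the same for labels and positions only because the default index is a `RangeIndex`. As a minimal sketch, the explicit accessors `.loc` (labels) and `.iloc` (positions) make the difference visible; the variable names here are illustrative and not part of the notebook:

```python
import numpy as np
import pandas as pd

s = pd.Series([2, 4, 6, 7, 8, np.nan])

by_label = s.loc[3]            # lookup by index label
by_position = s.iloc[3]        # lookup by integer position
label_slice = s.loc[1:4]       # label slices include the end label
position_slice = s.iloc[1:4]   # positional slices exclude the end position

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```

With the default index both lookups return the same element, but the two slices differ in length because only the label slice includes its end point.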

### B. Custom Indexing

In [11]:
data2 = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'z', 'g'])

print(data2)

a    1
b    2
c    3
z    4
g    5
dtype: int64
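
With a custom index, values can be looked up by label or by position. The following is a small sketch of the common access patterns, assuming the same `data2` as above; the result variable names are illustrative:

```python
import pandas as pd

data2 = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'z', 'g'])

value_z = data2['z']          # lookup by label
value_pos = data2.iloc[3]     # lookup by position (fourth element)
subset = data2[['a', 'g']]    # a list of labels returns a Series
has_z = 'z' in data2          # membership tests the index, not the values

print(value_z, value_pos, has_z)
print(subset.tolist())
```

Note that `in` checks the index labels, so `4 in data2` would be `False` here even though 4 is one of the values.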


### C. Data can be scalar

In [12]:
data3 = pd.Series(5.0, index=[100, 200, 300])

print(data3)

100    5.0
200    5.0
300    5.0
dtype: float64
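
The scalar is repeated once for every label in `index`. A short sketch of how the index controls the length of the result (the names `filled` and `single` are illustrative):

```python
import pandas as pd

filled = pd.Series(5.0, index=[100, 200, 300])   # scalar repeated for each label
single = pd.Series(5.0)                          # no index given: one element

print(len(filled), len(single))
```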


### D. Converting Dictionary to Series

In [13]:
population_dict = {'Bangalore': 38332521,
                   'Delhi': 26448193,
                   'Kolkota': 19651127,
                   'Chennai' : 19552839,
                   'Hyderabad' : 12882135}

population = pd.Series(population_dict)

print(population)

Bangalore    38332521
Delhi        26448193
Kolkota      19651127
Chennai      19552839
Hyderabad    12882135
dtype: int64


In [14]:
population['Delhi']

26448193

In [15]:
population['Bangalore':'Chennai']   # slicing by label (end label is included)

Bangalore    38332521
Delhi        26448193
Kolkota      19651127
Chennai      19552839
dtype: int64
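
Unlike positional slices, a label slice includes its end label. The sketch below contrasts the two and also shows that passing `index` together with a dictionary keeps only the listed keys; the names `label_slice`, `position_slice`, and `subset` are illustrative:

```python
import pandas as pd

population_dict = {'Bangalore': 38332521,
                   'Delhi': 26448193,
                   'Kolkota': 19651127,
                   'Chennai': 19552839,
                   'Hyderabad': 12882135}
population = pd.Series(population_dict)

label_slice = population['Bangalore':'Chennai']   # includes 'Chennai'
position_slice = population.iloc[0:3]             # excludes position 3

# Only the keys listed in `index` are kept, in the order given.
subset = pd.Series(population_dict, index=['Delhi', 'Chennai'])

print(len(label_slice), len(position_slice))
print(subset.index.tolist())
```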

## 2. Object Creation : Dataframe

### A. From dictionary of Series objects

In [16]:
area_dict = {'Bangalore': 743033,
             'Delhi': 849393,
             'Kolkota': 984833,
             'Chennai' : 932849,
             'Hyderabad' : 839390}

area = pd.Series(area_dict)

city = pd.DataFrame({'Populn': population, 'Area': area})

print(city)

             Populn    Area
Bangalore  38332521  743033
Delhi      26448193  849393
Kolkota    19651127  984833
Chennai    19552839  932849
Hyderabad  12882135  839390
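
When a `DataFrame` is built from a dictionary of `Series`, the rows are aligned on the index labels. As a minimal sketch with deliberately mismatched indexes (a reduced, illustrative subset of the data above), labels missing from one `Series` are filled with `NaN`:

```python
import pandas as pd

population = pd.Series({'Bangalore': 38332521, 'Delhi': 26448193})
area = pd.Series({'Delhi': 849393, 'Chennai': 932849})

# Rows are the union of both indexes; missing entries become NaN.
city = pd.DataFrame({'Populn': population, 'Area': area})

print(city)
```

Because `NaN` is a floating-point value, columns with missing entries are stored as `float64` rather than `int64`.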


In [17]:
city.index                        # print Index

Index(['Bangalore', 'Delhi', 'Kolkota', 'Chennai', 'Hyderabad'], dtype='object')

In [18]:
city.columns                      # print columns

Index(['Populn', 'Area'], dtype='object')

In [19]:
city.values

array([[38332521,   743033],
       [26448193,   849393],
       [19651127,   984833],
       [19552839,   932849],
       [12882135,   839390]], dtype=int64)

In [20]:
city.keys()

Index(['Populn', 'Area'], dtype='object')

In [21]:
# DataFrame as a specialized dictionary

print(city['Area'])                     # indexing

Bangalore    743033
Delhi        849393
Kolkota      984833
Chennai      932849
Hyderabad    839390
Name: Area, dtype: int64
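
The dictionary analogy also covers assignment: setting a new key adds a column. The sketch below uses a reduced two-city frame, and the `Density` column name is a hypothetical addition for illustration:

```python
import pandas as pd

city = pd.DataFrame({'Populn': pd.Series({'Bangalore': 38332521, 'Delhi': 26448193}),
                     'Area': pd.Series({'Bangalore': 743033, 'Delhi': 849393})})

area_column = city['Area']                        # key lookup returns a Series
city['Density'] = city['Populn'] / city['Area']   # key assignment adds a column

print(type(area_column).__name__)
print(list(city.columns))
```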


### B. From a nested list or a two-dimensional NumPy array

In [22]:
df1 = pd.DataFrame([[1,      np.nan, 2],
                   [2,      3,      5],
                   [np.nan, 4,      6]])

df1

     0    1  2
0  1.0  NaN  2
1  2.0  3.0  5
2  NaN  4.0  6


In [23]:
grid = np.random.rand(3, 2)

A = pd.DataFrame(grid, columns=['foo', 'bar'], index=['a', 'b', 'c'])

print(A)

        foo       bar
a  0.090773  0.448663
b  0.543983  0.864505
c  0.447330  0.090499
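
The values printed above change on every run because `np.random.rand` is unseeded. As a sketch of a reproducible variant, assuming NumPy's `default_rng` generator is available (the seed value is arbitrary), the same frame can be built and converted back to an array with `to_numpy()`:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)   # seeded generator; seed chosen arbitrarily
grid = rng.random((3, 2))             # 3x2 array of floats in [0, 1)

A = pd.DataFrame(grid, columns=['foo', 'bar'], index=['a', 'b', 'c'])
round_trip = A.to_numpy()             # back to a plain NumPy array

print(A.shape)
print(np.array_equal(round_trip, grid))
```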
