# Pandas Tutorial 01
Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

### Why do we use Pandas?

NumPy and Pandas are the foundational Python libraries for exploratory data analysis (EDA).

### Topics we cover in this tutorial
This Pandas tutorial covers the following operations:
- Creating DataFrames.
- Accessing rows and columns with the .loc[] and .iloc[] accessors.
- Checking the shape, number of dimensions, and size of a DataFrame with the .shape, .ndim, and .size attributes.
- Checking null values in all columns with <b>df.isnull().sum()</b>.
- Checking how many times each value occurs in a particular column with <b>df['column'].value_counts()</b>.
- Checking the unique values in a particular column with <b>df['column'].unique()</b>, which returns a NumPy array.


In [1]:
# First, import NumPy and Pandas under their conventional short aliases, 'np' and 'pd'.
import numpy as np
import pandas as pd

#### Generating a DataFrame with NumPy and Pandas

In [2]:
df = pd.DataFrame(data=np.arange(1,21).reshape(4,5),
                  index=np.arange(101,105),
                  columns=['c1', 'c2', 'c3', 'c4', 'c5'])
df

     c1  c2  c3  c4  c5
101   1   2   3   4   5
102   6   7   8   9  10
103  11  12  13  14  15
104  16  17  18  19  20
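
A DataFrame can also be built from a dict of columns instead of a reshaped NumPy array. The following is a minimal sketch; the name `df_from_dict` and the two-column data are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Hypothetical column data: each dict key becomes a column name,
# and each list holds that column's values from top to bottom.
data = {
    "c1": [1, 6, 11, 16],
    "c2": [2, 7, 12, 17],
}

# Reuse the same row labels (101 to 104) as the DataFrame above.
df_from_dict = pd.DataFrame(data, index=[101, 102, 103, 104])
print(df_from_dict)
```

The dict form is convenient when the columns hold different kinds of data, since each column gets its own dtype.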


In [12]:
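# Save the DataFrame to a CSV file.
# Uncomment the next line to write Test.csv, which is read back with pd.read_csv() at the end of the notebook.
# df.to_csv('Test.csv')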

In [25]:
# loc[] accessor: selects by row and column labels; label slices include both endpoints.
print(df.loc[101])
print(type(df.loc[101]))
print(df.loc[101:103,'c2':'c4'])
print(type(df.loc[101:103,'c2':'c4']))

c1    1
c2    2
c3    3
c4    4
c5    5
Name: 101, dtype: int32
<class 'pandas.core.series.Series'>
     c2  c3  c4
101   2   3   4
102   7   8   9
103  12  13  14
<class 'pandas.core.frame.DataFrame'>


In [29]:
# iloc[] accessor: selects by integer position; the stop position of a slice is excluded.
print(df.iloc[0])
print(type(df.iloc[0]))
print(df.iloc[0:3,1:4])
print(type(df.iloc[0:3,1:4]))

c1    1
c2    2
c3    3
c4    4
c5    5
Name: 101, dtype: int32
<class 'pandas.core.series.Series'>
     c2  c3  c4
101   2   3   4
102   7   8   9
103  12  13  14
<class 'pandas.core.frame.DataFrame'>
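
The two cells above return the same 3x3 block, because `.loc[]` slices by label and includes the end label, while `.iloc[]` slices by position and excludes the stop position. The sketch below rebuilds the same data under the assumed name `df_demo` so it can run on its own.

```python
import numpy as np
import pandas as pd

# Same data as the tutorial's DataFrame, rebuilt here so the example is self-contained.
df_demo = pd.DataFrame(np.arange(1, 21).reshape(4, 5),
                       index=np.arange(101, 105),
                       columns=["c1", "c2", "c3", "c4", "c5"])

# Label-based slice: rows 101 through 103 and columns c2 through c4, ends included.
by_label = df_demo.loc[101:103, "c2":"c4"]

# Position-based slice: rows 0, 1, 2 and columns 1, 2, 3; the stop positions are excluded.
by_position = df_demo.iloc[0:3, 1:4]

# Both selections hold the same labels and values.
print(by_label.equals(by_position))

# A single cell can be read either way: label (102, 'c3') is position (1, 2).
print(df_demo.loc[102, "c3"], df_demo.iloc[1, 2])
```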


In [32]:
# Checking the shape, size, and number of dimensions with the .shape, .size, and .ndim attributes.
print(f"shape = {df.shape}")
print(f"size = {df.size}")
print(f"number of dimensions = {df.ndim}")

shape = (4, 5)
size = 20
number of dimensions = 2
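
The three attributes are related: `.size` is the product of the entries in `.shape`, and `.ndim` is the length of `.shape`. A small sketch, using the assumed names `df_demo` and `col`, shows this for a DataFrame and for a single column (a Series).

```python
import numpy as np
import pandas as pd

# A 4x5 DataFrame with default integer row and column labels.
df_demo = pd.DataFrame(np.arange(1, 21).reshape(4, 5))

n_rows, n_cols = df_demo.shape
# size counts every cell, so it equals rows * columns.
print(n_rows * n_cols == df_demo.size)
# ndim is the number of axes, i.e. the length of the shape tuple.
print(len(df_demo.shape) == df_demo.ndim)

# Selecting one column gives a one-dimensional Series.
col = df_demo[0]
print(col.shape, col.size, col.ndim)
```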


In [40]:
# Flag null values cell by cell with df.isnull(); True marks a missing value.
df.isnull()

        c1     c2     c3     c4     c5
101  False  False  False  False  False
102  False  False  False  False  False
103  False  False  False  False  False
104  False  False  False  False  False


In [16]:
# Count the null values in each column with df.isnull().sum().
df.isnull().sum()

c1    0
c2    0
c3    0
c4    0
c5    0
dtype: int64
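
The tutorial's DataFrame has no missing values, so every count above is zero. The sketch below uses a small made-up DataFrame (`df_nan`, an assumption for illustration) that does contain `NaN` entries, to show what a non-zero result looks like.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing values in column c1.
df_nan = pd.DataFrame({
    "c1": [1.0, np.nan, 3.0, np.nan],
    "c2": [10, 20, 30, 40],
})

# isnull() marks each missing cell as True; sum() then counts the True values per column.
null_counts = df_nan.isnull().sum()
print(null_counts)

# Summing once more gives the total number of missing cells in the whole DataFrame.
total_nulls = null_counts.sum()
print(total_nulls)
```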

In [21]:
# Count how many times each value occurs in column c1.
df['c1'].value_counts()

6     1
11    1
1     1
16    1
Name: c1, dtype: int64
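
Every value in `c1` occurs once, so the counts above are all 1. To see `value_counts()` and `unique()` on data with repeats, here is a minimal sketch with a made-up Series (`colors` is an assumed name, not part of the notebook).

```python
import pandas as pd

# Hypothetical column with repeated values.
colors = pd.Series(["red", "blue", "red", "green", "red", "blue"], name="color")

# value_counts() returns a Series of counts, sorted from most to least frequent.
counts = colors.value_counts()
print(counts)

# unique() returns a NumPy array of the distinct values in order of first appearance.
distinct = colors.unique()
print(distinct)
```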

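In [42]:
# Overwrite columns c1 and c2 with new values.
df[['c1','c2']] = np.arange(51,59).reshape(4,2)

In [49]:
# Unique values in column c1 after the reassignment above; unique() returns a NumPy array.
df['c1'].unique()

array([51, 53, 55, 57])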

In [39]:
df 

     c1  c2  c3  c4  c5
101  51  52   3   4   5
102  53  54   8   9  10
103  55  56  13  14  15
104  57  58  18  19  20


In [3]:
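# Read Test.csv (written with df.to_csv() earlier) back in, using its first column as the row index.
test = pd.read_csv('Test.csv', index_col=0)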

In [4]:
test

     c1  c2  c3  c4  c5
101   1   2   3   4   5
102   6   7   8   9  10
103  11  12  13  14  15
104  16  17  18  19  20
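
The `index_col=0` argument matters because `to_csv()` writes the row labels as the first column of the file. The round trip can be sketched as follows; it writes to a temporary directory rather than the working directory, and the names `df_demo`, `restored`, and `raw` are assumptions for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Same data as the tutorial's original DataFrame.
df_demo = pd.DataFrame(np.arange(1, 21).reshape(4, 5),
                       index=np.arange(101, 105),
                       columns=["c1", "c2", "c3", "c4", "c5"])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Test.csv")
    # to_csv() writes the row labels as the first, unnamed column.
    df_demo.to_csv(path)
    # With index_col=0, that first column becomes the row index again.
    restored = pd.read_csv(path, index_col=0)
    # Without index_col, the labels come back as an ordinary column.
    raw = pd.read_csv(path)

print(restored.index.tolist())
print(raw.columns.tolist())
```

Without `index_col=0`, pandas keeps a default 0-based index and names the label column `Unnamed: 0`.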
