## pandas
pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python.

### Importing Pandas

In [1]:
import numpy as np
import pandas as pd

### Series from lists

In [2]:
# Strings: build a Series from a list of strings
country = ['Banglades','India','Nepal','Butan','Maldiv','Srilanka','Arab','UK']

# Convert the list into a pandas Series

series = pd.Series(country)
series

0    Banglades
1        India
2        Nepal
3        Butan
4       Maldiv
5     Srilanka
6         Arab
7           UK
dtype: object

In [3]:
# Integers: build a Series from a list of integers

numbers = [12,15,14,16,17,18,19,20,13,89,100]
number = pd.Series(numbers)
number

0      12
1      15
2      14
3      16
4      17
5      18
6      19
7      20
8      13
9      89
10    100
dtype: int64

In [24]:
# Custom index: pass index= to label each value instead of using the default 0, 1, 2, ... positions

marks = [56,89,72,45,96,13,25,48,69,57]
# pd.Series(marks)
subjects = ['Bangla','English','BGS','Science','Math','Higher Math','Physics','Chemistrt','Biology','IME']
# pd.Series(subjects)
total = pd.Series(marks, index=subjects)
total

Bangla         56
English        89
BGS            72
Science        45
Math           96
Higher Math    13
Physics        25
Chemistrt      48
Biology        69
IME            57
dtype: int64

In [12]:
# The reverse also works: use the marks as the index and the subjects as the values
pd.Series(subjects, index=marks)

56         Bangla
89        English
72            BGS
45        Science
96           Math
13    Higher Math
25        Physics
48      Chemistrt
69        Biology
57            IME
dtype: object

In [13]:
# Setting a name: the name argument gives the Series a label, stored in its name attribute

marks = pd.Series(marks, name = 'Mark sheet of Dhoha.')
marks

0    56
1    89
2    72
3    45
4    96
5    13
6    25
7    48
8    69
9    57
Name: Mark sheet of Dhoha., dtype: int64

### Series from dictionary
When a Series is created from a dictionary, the keys become the index and the values become the data.

In [17]:
marks = {
    'Bangla': 85,
    'English':  90,
    'Physics': 82,
    'Chemistry': 87,
    'Higher Math': 88,
    'Math': 90
}
pd.Series(marks, name = 'Mark sheet of Dhoha.')

Bangla         85
English        90
Physics        82
Chemistry      87
Higher Math    88
Math           90
Name: Mark sheet of Dhoha., dtype: int64

### Series Attributes
* size: total number of elements in the Series
* dtype: data type of the values
* name: label for the Series, much like a column or table name
* is_unique: True if no value is repeated, otherwise False
* index: the labels of the Series, comparable to the keys of a dictionary
* values: the data as a NumPy array, comparable to the values of a dictionary

In [20]:
# size: total number of elements, counting missing values (NaN) as well
marks = pd.Series(marks)
marks.size

6

In [22]:
marks.dtype

dtype('int64')

In [23]:
# is_unique is False because at least one value repeats (90 appears for both English and Math)
marks.is_unique

False
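
To see which values cause `is_unique` to return False, one option is to filter for duplicates. This is a minimal sketch that rebuilds the same marks dictionary used above; `duplicated(keep=False)` and `nunique()` are standard pandas methods that do not appear elsewhere in this notebook.

```python
import pandas as pd

marks = pd.Series({
    'Bangla': 85, 'English': 90, 'Physics': 82,
    'Chemistry': 87, 'Higher Math': 88, 'Math': 90,
})

# keep=False flags every occurrence of a repeated value, not only the later ones
repeated = marks[marks.duplicated(keep=False)]
print(repeated)          # the English and Math rows, both holding 90

print(marks.is_unique)   # False, because 90 appears twice
print(marks.nunique())   # number of distinct values in the Series
```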

In [25]:
# index
total.index

Index(['Bangla', 'English', 'BGS', 'Science', 'Math', 'Higher Math', 'Physics',
       'Chemistrt', 'Biology', 'IME'],
      dtype='object')

In [26]:
# values
total.values

array([56, 89, 72, 45, 96, 13, 25, 48, 69, 57], dtype=int64)

### Series Using read_csv
To load a CSV file from disk, use `read_csv`. It returns a DataFrame by default; a file with a single data column can then be converted into a Series.

In [36]:
# CSV file with a single column
sub = pd.read_csv('subs.csv')
sub

     Subscribers gained
0                    48
1                    57
2                    40
3                    43
4                    44
..                  ...
360                 231
361                 226
362                 155
363                 144


In [48]:
# CSV file with two columns; match_no is used as the index

kohli_ipl = pd.read_csv('kohli_ipl.csv', index_col = 'match_no')
kohli_ipl

          runs
match_no
1            1
2           23
3           13
4           12
5            1
...        ...
211          0
212         20
213         73
214         25


In [45]:
# CSV file with two string columns; movie is used as the index
movies = pd.read_csv('bollywood.csv',index_col ='movie')
movies

                                                  lead
movie
Uri: The Surgical Strike                 Vicky Kaushal
Battalion 609                              Vicky Ahuja
The Accidental Prime Minister (film)       Anupam Kher
Why Cheat India                          Emraan Hashmi
Evening Shadows                       Mona Ambegaonkar
...                                                ...
Hum Tumhare Hain Sanam                  Shah Rukh Khan
Aankhen (2002 film)                   Amitabh Bachchan
Saathiya (film)                           Vivek Oberoi
Company (film)                              Ajay Devgn


In [39]:
# squeeze: collapses an axis of length 1, so a one-column DataFrame becomes a Series
# index_col: names the column to use as the index (row labels) of the DataFrame
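
The outputs above are still DataFrames, so the squeeze step is what actually produces a Series. The sketch below shows one way to do it with `DataFrame.squeeze("columns")`. The CSV text is a small made-up stand-in for files such as subs.csv and kohli_ipl.csv, read through `io.StringIO` so the example runs without those files. Older tutorials pass `squeeze=True` to `read_csv`; as far as I know, recent pandas releases dropped that argument in favour of calling `.squeeze()` on the result.

```python
import io
import pandas as pd

# Hypothetical stand-in for a one-column file such as subs.csv
subs_csv = "Subscribers gained\n48\n57\n40\n43\n44\n"

df = pd.read_csv(io.StringIO(subs_csv))   # one-column DataFrame
subs = df.squeeze("columns")              # collapse the column axis into a Series

print(type(df).__name__)     # DataFrame
print(type(subs).__name__)   # Series
print(subs.name)             # the column header becomes the Series name

# Hypothetical stand-in for a two-column file such as kohli_ipl.csv
kohli_csv = "match_no,runs\n1,1\n2,23\n3,13\n"

# index_col moves match_no into the index, leaving one data column to squeeze
runs = pd.read_csv(io.StringIO(kohli_csv), index_col="match_no").squeeze("columns")
print(runs)
```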

### Series Methods
* head()
* tail()
* sample()
* value_counts()
* sort_values(): accepts inplace
* sort_index(): accepts inplace
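
The methods listed above can be tried on a small Series. This is a rough sketch, not output from the original files: the values are a handful of illustrative runs, and `random_state=0` is only there to make `sample()` repeatable.

```python
import pandas as pd

# Small illustrative Series (not the full kohli_ipl.csv data)
runs = pd.Series([1, 23, 13, 12, 1, 0, 20, 73, 25], name="runs")

first_three = runs.head(3)                    # first 3 rows (default is 5)
last_two = runs.tail(2)                       # last 2 rows (default is 5)
random_one = runs.sample(1, random_state=0)   # one randomly chosen row

counts = runs.value_counts()                  # how often each value occurs

descending = runs.sort_values(ascending=False)   # returns a new, sorted Series
restored = descending.sort_index()               # back to the original label order

# With inplace=True the Series itself is modified and the call returns None
in_place = runs.copy()
result = in_place.sort_values(inplace=True)
print(result)              # None
print(in_place.tolist())   # values in ascending order
```

Because the `inplace=True` form returns None, assigning its result back to the variable would discard the data; either reassign the returned Series or sort in place, not both.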

### Series Math methods:
* count
* sum
* product
* mean
* median
* mode
* std - standard deviation
* var - variance
* min/max
* describe()
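
As a quick illustration of the math methods above, the sketch below applies them to the same marks dictionary used earlier in this notebook. The small `[1, 2, 3, 4]` Series for the product is an added example to keep the number readable. Note that `std` and `var` use the sample formulas (dividing by n - 1) by default.

```python
import pandas as pd

marks = pd.Series({
    'Bangla': 85, 'English': 90, 'Physics': 82,
    'Chemistry': 87, 'Higher Math': 88, 'Math': 90,
})

print(marks.count())    # 6 non-missing values
print(marks.sum())      # 522
print(marks.mean())     # 87.0
print(marks.median())   # 87.5
print(marks.mode())     # a Series, since several values can tie; here it holds 90
print(marks.min(), marks.max())   # 82 90

print(marks.var())      # sample variance, about 9.6
print(marks.std())      # sample standard deviation, the square root of the variance

# Product of all values, shown on a small illustrative Series
small = pd.Series([1, 2, 3, 4])
print(small.prod())     # 24

# describe() bundles count, mean, std, min, quartiles and max into one Series
summary = marks.describe()
print(summary)
```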
