# What is Pandas?
Pandas is a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool, built on top of the Python programming language.
https://pandas.pydata.org/about/index.html

**Pandas Series**
A Pandas Series is like a column in a table. It is a one-dimensional labeled array that can hold data of any type.

## Importing Pandas

In [1]:
import numpy as np

In [2]:
import pandas as pd



## Series from List

In [3]:
# String
country = ['Bharat','Mongolia','USA','Nepal','Indonesia']
pd.Series(country)

0       Bharat
1     Mongolia
2          USA
3        Nepal
4    Indonesia
dtype: object

In [66]:
# integer
runs = [31,58,100,29,0,105]
pd.Series(runs)

0     31
1     58
2    100
3     29
4      0
5    105
dtype: int64

In [5]:
# Custom Index
marks = [90,98,95,98,100]
subject = ['Data Science','Machine Learning','Statistics','Hindi','English']

pd.Series(marks,index=subject)

Data Science         90
Machine Learning     98
Statistics           95
Hindi                98
English             100
dtype: int64

In [6]:
# setting a name 
marksheet = pd.Series(marks,index=subject,name ='Marks of Nishant')
marksheet

Data Science         90
Machine Learning     98
Statistics           95
Hindi                98
English             100
Name: Marks of Nishant, dtype: int64

## Series from Dictionary

In [7]:
# dictionary keys become the index labels
marks = {
    'maths':67,
    'english': 100,
    'science':100,
    'hindi':98
}
marks_series = pd.Series(marks,name='Marks of Nishant')
marks_series

maths       67
english    100
science    100
hindi       98
Name: Marks of Nishant, dtype: int64
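
When a Series is built from a dictionary, the keys become the index labels. As a small sketch (the `wanted` list is a made-up name for illustration, and `'sst'` is deliberately missing from the dictionary), passing `index=` selects and orders the keys, and a label with no matching key comes back as `NaN`:

```python
import pandas as pd

marks = {'maths': 67, 'english': 100, 'science': 100, 'hindi': 98}

# Hypothetical label list: 'sst' is not a key in the dictionary.
wanted = ['hindi', 'maths', 'sst']
selected = pd.Series(marks, index=wanted)

print(selected)
# The missing label becomes NaN, so the dtype is promoted to float64.
print(selected.isna().sum())
```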

## Series Attributes

In [8]:
# size
marks_series.size

4

In [9]:
# dtype
marks_series.dtype

dtype('int64')

In [10]:
# is_unique (False here because the value 100 appears twice)
marks_series.is_unique

False

In [11]:
# index
marks_series.index

Index(['maths', 'english', 'science', 'hindi'], dtype='object')

In [12]:
# Values
marks_series.values

array([ 67, 100, 100,  98], dtype=int64)

## Series using read_csv

In [13]:
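# with one column: read_csv still returns a DataFrame
# (raw string so the backslashes in the Windows path are not treated as escapes)
df = pd.read_csv(r'E:\Data Science\Python for DS\DataSets\subs.csv')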
type(df)

pandas.core.frame.DataFrame

In [14]:
# convert the single-column DataFrame to a Series with DataFrame.squeeze()
# (the old read_csv(..., squeeze=True) argument was removed in pandas 2.0)
df = pd.read_csv(r'E:\Data Science\Python for DS\DataSets\subs.csv')
sub = df.squeeze()
sub

0       48
1       57
2       40
3       43
4       44
      ... 
360    231
361    226
362    155
363    144
364    172
Name: Calories, Length: 365, dtype: int64
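
The same pattern can be tried without a file on disk. The sketch below assumes a made-up one-column CSV held in memory (the column name `subscribers` and its values are illustrative, not taken from `subs.csv`). Passing `"columns"` to `squeeze` collapses only the column axis, so the result stays a Series even if the file happens to contain a single row:

```python
import io
import pandas as pd

# Made-up CSV content standing in for a one-column file on disk.
csv_text = "subscribers\n48\n57\n40\n"

df = pd.read_csv(io.StringIO(csv_text))
print(type(df))                  # a DataFrame, even with a single column

# Collapse the single column into a Series; the column name becomes Series.name.
series = df.squeeze("columns")
print(type(series))
print(series.name)
```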

In [15]:
# with two columns: use one as the index, then squeeze the remaining column
vk = pd.read_csv(r'E:\Data Science\Python for DS\DataSets\kohli_ipl.csv', index_col='match_no')
run = vk.squeeze()
type(run)

pandas.core.series.Series

In [16]:
df = pd.read_csv(r'E:\Data Science\Python for DS\DataSets\bollywood.csv', index_col = 'movie')
mv = df.squeeze()
type(mv)
mv

movie
Uri: The Surgical Strike                   Vicky Kaushal
Battalion 609                                Vicky Ahuja
The Accidental Prime Minister (film)         Anupam Kher
Why Cheat India                            Emraan Hashmi
Evening Shadows                         Mona Ambegaonkar
                                              ...       
Hum Tumhare Hain Sanam                    Shah Rukh Khan
Aankhen (2002 film)                     Amitabh Bachchan
Saathiya (film)                             Vivek Oberoi
Company (film)                                Ajay Devgn
Awara Paagal Deewana                        Akshay Kumar
Name: lead, Length: 1500, dtype: object

## Series Methods

In [17]:
# head and tail: first and last five rows by default
mv.head()

movie
Uri: The Surgical Strike                   Vicky Kaushal
Battalion 609                                Vicky Ahuja
The Accidental Prime Minister (film)         Anupam Kher
Why Cheat India                            Emraan Hashmi
Evening Shadows                         Mona Ambegaonkar
Name: lead, dtype: object

In [18]:
run.tail()

match_no
211     0
212    20
213    73
214    25
215     7
Name: runs, dtype: int64

In [19]:
# sample
mv.sample(5)

movie
Banjo (2016 film)       Riteish Deshmukh
Contract (2008 film)      Adhvik Mahajan
Gangaajal                     Ajay Devgn
Qarib Qarib Singlle          Irrfan Khan
Jail (2009 film)          Manoj Bajpayee
Name: lead, dtype: object

In [20]:
# value_counts: number of movies per lead actor, most frequent first
mv.value_counts()

lead
Akshay Kumar        48
Amitabh Bachchan    45
Ajay Devgn          38
Salman Khan         31
Sanjay Dutt         26
                    ..
Diganth              1
Parveen Kaur         1
Seema Azmi           1
Akanksha Puri        1
Edwin Fernandes      1
Name: count, Length: 566, dtype: int64
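
As a small illustration with made-up labels (not the Bollywood data), `value_counts` can also report relative frequencies through its `normalize` parameter:

```python
import pandas as pd

# Hypothetical data: six entries spread over three labels.
leads = pd.Series(['A', 'B', 'A', 'C', 'A', 'B'])

counts = leads.value_counts()                # absolute counts, largest first
share = leads.value_counts(normalize=True)   # fractions that sum to 1

print(counts)
print(share)
```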

In [21]:
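# sort_values returns a new sorted Series; run itself is left unchanged
run.sort_values()                  # ascending (not displayed: only the last expression is shown)
run.sort_values(ascending=False)   # descending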

match_no
128    113
126    109
123    108
164    100
120    100
      ... 
93       0
211      0
130      0
8        0
135      0
Name: runs, Length: 215, dtype: int64

In [22]:
# highest score: first value of the descending sort (same value as run.max())
run.sort_values(ascending=False).head(1).values[0]

113

In [23]:
# inplace=True sorts the Series itself and returns None, so work on a copy
run_copy = run.copy()
run_copy.sort_values(inplace=True,ascending=False)
run_copy

match_no
128    113
126    109
123    108
164    100
120    100
      ... 
93       0
211      0
130      0
8        0
135      0
Name: runs, Length: 215, dtype: int64
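
The difference between the two styles can be checked on a tiny Series. This is a minimal sketch with made-up scores (not the IPL data): without `inplace`, `sort_values` hands back a new object, and with `inplace=True` it modifies the Series and returns `None`.

```python
import pandas as pd

scores = pd.Series([31, 58, 100, 29], index=['a', 'b', 'c', 'd'])

# Default: a new sorted Series is returned, the original keeps its order.
ordered = scores.sort_values(ascending=False)
before = scores.tolist()
print(before)
print(ordered.tolist())

# inplace=True: the Series is reordered and the call returns None.
result = scores.sort_values(ascending=False, inplace=True)
after = scores.tolist()
print(result)
print(after)
```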

In [24]:
# sort_index orders by the index labels (movie titles) instead of the values
mv_copy = mv.copy()
mv_copy.sort_index(inplace=True, ascending=True)
mv_copy

movie
1920 (film)                   Rajniesh Duggall
1920: London                     Sharman Joshi
1920: The Evil Returns             Vicky Ahuja
1971 (2007 film)                Manoj Bajpayee
2 States (2014 film)              Arjun Kapoor
                                   ...        
Zindagi 50-50                      Veena Malik
Zindagi Na Milegi Dobara        Hrithik Roshan
Zindagi Tere Naam           Mithun Chakraborty
Zokkomon                       Darsheel Safary
Zor Lagaa Ke...Haiya!            Meghan Jadhav
Name: lead, Length: 1500, dtype: object

## Series Math Methods

In [25]:
# count of non-missing values; vk is a DataFrame, so the result is one value per column
vk.count()

runs    215
dtype: int64

In [26]:
# sum (use .product() for the product)
sub.sum()

49510

In [27]:
# mean, median, mode, std, var
# (sub and mv are Series; vk is a DataFrame, so its results are reported per column)
print(sub.mean())
print(vk.median())
print(mv.mode())
print(sub.std())
print(vk.var())

135.64383561643837
runs    24.0
dtype: float64
0    Akshay Kumar
Name: lead, dtype: object
62.6750230372527
runs    688.002478
dtype: float64
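
The mixed output shapes above come from the object each method is called on. As a sketch with made-up numbers (the names `frame` and `series` are illustrative), a DataFrame reduces to one value per column, a Series reduces to a scalar, and `mode` always returns a Series because several values can tie:

```python
import pandas as pd

frame = pd.DataFrame({'runs': [1, 23, 13, 12]})   # hypothetical scores
series = frame['runs']

frame_median = frame.median()    # Series indexed by column name
series_median = series.median()  # plain scalar
series_mode = series.mode()      # Series, even when there is a single mode

print(frame_median)
print(series_median)
print(series_mode)
```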


In [28]:
# minimum
sub.min()

33

In [29]:
# maximum
sub.max()

396

In [30]:
# describe
run.describe()
# type(vk)

count    215.000000
mean      30.855814
std       26.229801
min        0.000000
25%        9.000000
50%       24.000000
75%       48.000000
max      113.000000
Name: runs, dtype: float64

## Series Indexing

In [31]:
# integer indexing
x = pd.Series([12,23,45,56,90])
x[1]

23

In [37]:
# Negative indexing does not work on the default integer index:
# -1 is looked up as a label, which does not exist, so this raises a KeyError
x[-1]
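
A short sketch of that behaviour, together with `.iloc`, which always works by position:

```python
import pandas as pd

x = pd.Series([12, 23, 45, 56, 90])

raised = False
try:
    x[-1]                 # -1 is treated as an index label, and no such label exists
except KeyError as err:
    raised = True
    print("KeyError:", err)

last = x.iloc[-1]         # positional access: the last element
print(last)
```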

In [40]:
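# on a string index, mv[-1] falls back to position but emits a FutureWarning; .iloc is explicit
mv.iloc[-1]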


'Akshay Kumar'

In [43]:
marksheet.iloc[-1]   # explicit positional access instead of the deprecated marksheet[-1]


100

In [44]:
# slicing with integers is positional: positions 5 through 15 (the end is excluded)
run[5:16]

match_no
6      9
7     34
8      0
9     21
10     3
11    10
12    38
13     3
14    11
15    50
16     2
Name: runs, dtype: int64

In [46]:
mv[-5:]

movie
Hum Tumhare Hain Sanam      Shah Rukh Khan
Aankhen (2002 film)       Amitabh Bachchan
Saathiya (film)               Vivek Oberoi
Company (film)                  Ajay Devgn
Awara Paagal Deewana          Akshay Kumar
Name: lead, dtype: object

In [50]:
mv[-5:-2:2]

movie
Hum Tumhare Hain Sanam    Shah Rukh Khan
Saathiya (film)             Vivek Oberoi
Name: lead, dtype: object

In [54]:
# fancy indexing: pass a list of index labels (match numbers here)
run[[1,2,5,6,3]]

match_no
1     1
2    23
5     1
6     9
3    13
Name: runs, dtype: int64

In [58]:
# indexing with a label
mv['Why Cheat India']

'Emraan Hashmi'
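
Because plain square brackets can mean either a label or a position depending on the index, the explicit accessors are often easier to read. A minimal sketch using a shortened version of the marks data: `.loc` selects by label and `.iloc` selects by position.

```python
import pandas as pd

marks = pd.Series([90, 98, 95],
                  index=['Data Science', 'Machine Learning', 'Statistics'])

by_label = marks.loc['Statistics']                          # single label
by_position = marks.iloc[0]                                 # single position
several = marks.loc[['Statistics', 'Data Science']]         # list of labels
label_slice = marks.loc['Data Science':'Machine Learning']  # label slices include the end
position_slice = marks.iloc[0:2]                            # position slices exclude the end

print(by_label, by_position)
print(several)
print(label_slice)
print(position_slice)
```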

## Editing Series

In [59]:
marksheet

Data Science         90
Machine Learning     98
Statistics           95
Hindi                98
English             100
Name: Marks of Nishant, dtype: int64

In [61]:
# using positional indexing (.iloc avoids the FutureWarning raised by marksheet[-1] = 99)
marksheet.iloc[-1] = 99
marksheet


Data Science        90
Machine Learning    98
Statistics          95
Hindi               98
English             99
Name: Marks of Nishant, dtype: int64

In [64]:
# If the label does not exist yet, assignment adds a new item to the Series
marksheet['sst']= 95
marksheet

Data Science        90
Machine Learning    98
Statistics          95
Hindi               98
English             99
sst                 95
Name: Marks of Nishant, dtype: int64

In [73]:
# slicing (note: runs is the plain Python list defined earlier, not a Series)
print(runs)
runs[2:4] = [90,100]
print(runs)

[31, 58, 100, 100, 0, 105]
[31, 58, 90, 100, 0, 105]
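
The cell above assigns to a slice of the Python list `runs`. A minimal sketch of the same edit on a Series, assuming the values from the original `runs` list and a hypothetical name `runs_series`, uses `.iloc` so the slice is unambiguously positional:

```python
import pandas as pd

runs_series = pd.Series([31, 58, 100, 29, 0, 105])

# Replace the values at positions 2 and 3 (the end position 4 is excluded).
runs_series.iloc[2:4] = [90, 100]

print(runs_series.tolist())
```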
