## Topics
<br>

- ### pd.Series() Function
- ### Series with list
- ### keys and values and Index
- ### Dictionary in Series
- ### Name Parameter in Series
- ### Creating a Series from a CSV file
- ### pd.read_csv() Function
- ### .squeeze("columns") Method for pd.read_csv()
- ### .head()
- ### .tail()
- ### .sample()
<br>

- ### .value_counts()
- ### .sort_values()
- ### .sort_index()
- ### .copy()
- ### .sum()
- ### .product()
- ### .prod()
- ### .mean()
- ### .median()
- ### .mode()
- ### .std()
- ### .var()
- ### .min()
- ### .max()
- ### .describe()
- ### .info()

## Series Attributes

- ### .size
- ### .dtype
- ### .name
- ### .index
- ### .values
- ### .is_unique
- ### .ndim (same as numpy)
- ### .shape


# What is Series
<br>

- ### In Pandas, a series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating-point numbers, Python objects, etc.).
<br>

- ### It is similar to a NumPy array, but with an associated label for each element, which is called an index.

In [1]:
import pandas as pd
import numpy as np

In [2]:
country = ['India','Pakistan','USA','Nepal','Srilanka']
country

['India', 'Pakistan', 'USA', 'Nepal', 'Srilanka']

## Use pd.Series()

In [3]:
pd.Series(country)

0       India
1    Pakistan
2         USA
3       Nepal
4    Srilanka
dtype: object

### Here 0, 1, 2, 3, 4 are the index labels. A Series can be read as key-value pairs: the index is the key and the country name is the value.
<br>

### dtype shows the data type of the values.
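The key-value reading described above can be made concrete with a minimal sketch. It reuses the `country` list from the cell above; the variable names `s`, `first`, and `pairs` are introduced here only for illustration.

```python
import pandas as pd

country = ['India', 'Pakistan', 'USA', 'Nepal', 'Srilanka']
s = pd.Series(country)

# With the default index, the labels are the integers 0..n-1
first = s[0]

# .items() yields (index, value) pairs, much like dict.items()
pairs = list(s.items())
```

Here `first` holds the value stored under label `0`, and `pairs` lists every label together with its value.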

# Set Custom Index 

In [4]:
# custom index
marks = [67,57,89,100]
subjects = ['maths','english','science','hindi']

pd.Series(marks,index=subjects) # here the subjects (keys) are the index and the marks are the values

maths       67
english     57
science     89
hindi      100
dtype: int64
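Once a custom index is set, values can be looked up either by label or by position. The sketch below assumes the same `marks` and `subjects` lists as the cell above; the names `s`, `by_label`, `by_loc`, and `by_position` are illustrative only.

```python
import pandas as pd

marks = [67, 57, 89, 100]
subjects = ['maths', 'english', 'science', 'hindi']
s = pd.Series(marks, index=subjects)

by_label = s['science']      # label-based lookup
by_loc = s.loc['science']    # explicit label-based lookup
by_position = s.iloc[2]      # position-based lookup (third element)
```

All three expressions reach the same element, because `'science'` is the label at position 2.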

# Use Dictionary in Series

In [5]:
marks = {
    'maths':67,
    'english':57,
    'science':89,
    'Urdu':100
}

marks_series = pd.Series(marks)
marks_series

maths       67
english     57
science     89
Urdu       100
dtype: int64

## Here the index labels are the keys of the dictionary
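Because the dictionary keys become the index, an explicit `index=` argument can be used to select and reorder them. This is a small sketch, not part of the original notebook; the label `'history'` is a hypothetical key that is deliberately absent from the dictionary.

```python
import pandas as pd

marks = {'maths': 67, 'english': 57, 'science': 89, 'Urdu': 100}

# Only the requested labels are kept, in the requested order.
# 'history' (hypothetical) is not a key, so its value becomes NaN.
selected = pd.Series(marks, index=['Urdu', 'maths', 'history'])
```

A missing label does not raise an error; it simply produces a missing value, which also turns the values into floats.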

<br>

# Name Parameter
<br>

In [6]:
marks =  pd.Series(marks,name="Mubeen ky Marks")
marks

maths       67
english     57
science     89
Urdu       100
Name: Mubeen ky Marks, dtype: int64
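One place where the name matters: when a Series is converted to a DataFrame, the name is used as the column label. The following sketch assumes a shortened version of the marks data; `frame` and `renamed` are illustrative names.

```python
import pandas as pd

s = pd.Series({'maths': 67, 'english': 57}, name='Mubeen ky Marks')

# The Series name becomes the column label of the resulting DataFrame
frame = s.to_frame()

# Passing a scalar to rename() returns a copy with a new name
renamed = s.rename('marks')
```

`rename` leaves the original Series untouched, so `s.name` is still `'Mubeen ky Marks'` afterwards.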

# Common Series Attributes

- ### .size

In [7]:
marks.size # check the size

4

- ### .dtype

In [8]:
marks.dtype # check data type

dtype('int64')

- ### .name

In [9]:
marks.name # check the name

'Mubeen ky Marks'

- ### .index

In [10]:
marks.index # check index (keys)

Index(['maths', 'english', 'science', 'Urdu'], dtype='object')

- ### .values

In [11]:
marks.values

array([ 67,  57,  89, 100])

- ### .is_unique

In [12]:
marks.is_unique

True

- ### .shape

In [13]:
marks.shape

(4,)
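The attributes above relate to each other in simple ways, and `.is_unique` only returns `True` when no value repeats. A minimal sketch, assuming the same marks data; `with_dup` is a made-up Series containing a repeated value.

```python
import pandas as pd

marks = pd.Series({'maths': 67, 'english': 57, 'science': 89, 'Urdu': 100})

n_dims = marks.ndim            # a Series always has exactly one dimension
shape = marks.shape            # one-element tuple holding the length
size = marks.size              # number of elements

# A repeated value makes is_unique False
with_dup = pd.Series([67, 57, 67])
dup_unique = with_dup.is_unique
```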

# Working with a CSV file
<br>

In [14]:
!ls datasets/

bollywood.csv  ipl-matches.csv	kohli_ipl.csv  movies.csv  subs.csv


## pd.read_csv()

In [15]:
data = pd.read_csv("./datasets/subs.csv")
data

Unnamed: 0,Subscribers gained
0,48
1,57
2,40
3,43
4,44
...,...
360,231
361,226
362,155
363,144


## Check Type

In [16]:
type(data)

pandas.core.frame.DataFrame

## Check Dimension

In [17]:
data.ndim

2

## Note: by default pd.read_csv() returns a DataFrame, so use the .squeeze("columns") method to get a Series

In [18]:
data = pd.read_csv("./datasets/subs.csv").squeeze("columns")
data

0       48
1       57
2       40
3       43
4       44
      ... 
360    231
361    226
362    155
363    144
364    172
Name: Subscribers gained, Length: 365, dtype: int64

## Now check the type again

In [19]:
type(data)

pandas.core.series.Series

## Check Dimension

In [20]:
data.ndim

1
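The same behaviour can be reproduced without the dataset file. The sketch below is an assumption-laden stand-in: it feeds a tiny in-memory CSV (three made-up rows under the `Subscribers gained` header) to `pd.read_csv()` through `io.StringIO`.

```python
import io
import pandas as pd

# A tiny one-column CSV held in memory instead of datasets/subs.csv
csv_text = "Subscribers gained\n48\n57\n40\n"

df = pd.read_csv(io.StringIO(csv_text))   # always a DataFrame
s = df.squeeze("columns")                 # single column -> Series
```

`squeeze("columns")` collapses the column axis only when there is exactly one column, which is why it works here.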

# Now read a CSV file with two columns
## With two columns, .squeeze("columns") still returns 2-dimensional data

In [21]:
!ls datasets/kohli_ipl.csv

datasets/kohli_ipl.csv


In [22]:
data =  pd.read_csv("datasets/kohli_ipl.csv").squeeze("columns")
data

Unnamed: 0,match_no,runs
0,1,1
1,2,23
2,3,13
3,4,12
4,5,1
...,...,...
210,211,0
211,212,20
212,213,73
213,214,25


## Check Data Type

In [23]:
type(data)

pandas.core.frame.DataFrame

## Use index_col='match_no' so the data is treated as 1-dimensional

In [24]:
data =  pd.read_csv("datasets/kohli_ipl.csv",index_col="match_no").squeeze("columns")
data

match_no
1       1
2      23
3      13
4      12
5       1
       ..
211     0
212    20
213    73
214    25
215     7
Name: runs, Length: 215, dtype: int64

## Now check the data type again

In [25]:
type(data)

pandas.core.series.Series
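The effect of `index_col` can be sketched with an in-memory CSV. The three rows below are copied from the first rows of the output above; `two_cols` and `one_col` are illustrative names.

```python
import io
import pandas as pd

csv_text = "match_no,runs\n1,1\n2,23\n3,13\n"

# Two data columns: squeeze("columns") has nothing to collapse
two_cols = pd.read_csv(io.StringIO(csv_text)).squeeze("columns")

# match_no becomes the index, leaving a single data column to squeeze
one_col = pd.read_csv(io.StringIO(csv_text), index_col="match_no").squeeze("columns")
```

Moving `match_no` into the index leaves only `runs` as data, so the result is a Series labelled by match number.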

## .head()

In [26]:
data.head() # returns the first 5 rows

match_no
1     1
2    23
3    13
4    12
5     1
Name: runs, dtype: int64

In [27]:
data.head(3) # change the number of rows

match_no
1     1
2    23
3    13
Name: runs, dtype: int64

## .tail()

In [28]:
data.tail() # returns the last 5 rows

match_no
211     0
212    20
213    73
214    25
215     7
Name: runs, dtype: int64

In [29]:
data.tail(3) # change the number of rows

match_no
213    73
214    25
215     7
Name: runs, dtype: int64

## .sample()

In [30]:
data.sample() # returns 1 random row

match_no
155    4
Name: runs, dtype: int64

In [31]:
data.sample() # returns 1 random row (different on each call)

match_no
215    7
Name: runs, dtype: int64

In [32]:
data.sample(5) # change the number of samples

match_no
7      34
103    51
17     22
185    33
30     38
Name: runs, dtype: int64
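As the two `data.sample()` calls above show, each call returns different rows. When a repeatable sample is needed, the `random_state` parameter can be fixed. This is a sketch with made-up run values, and the seed `42` is an arbitrary choice.

```python
import pandas as pd

runs = pd.Series([1, 23, 13, 12, 1, 0, 20, 73, 25, 7])

# The same seed gives the same rows on every call
a = runs.sample(3, random_state=42)
b = runs.sample(3, random_state=42)
```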

## .value_counts()

In [33]:
data.value_counts()

0     9
1     8
12    8
9     7
35    6
     ..
36    1
45    1
71    1
37    1
53    1
Name: runs, Length: 78, dtype: int64

In [34]:
movies = pd.read_csv('./datasets/bollywood.csv',index_col='movie').squeeze("columns")
movies

movie
Uri: The Surgical Strike                   Vicky Kaushal
Battalion 609                                Vicky Ahuja
The Accidental Prime Minister (film)         Anupam Kher
Why Cheat India                            Emraan Hashmi
Evening Shadows                         Mona Ambegaonkar
                                              ...       
Hum Tumhare Hain Sanam                    Shah Rukh Khan
Aankhen (2002 film)                     Amitabh Bachchan
Saathiya (film)                             Vivek Oberoi
Company (film)                                Ajay Devgn
Awara Paagal Deewana                        Akshay Kumar
Name: lead, Length: 1500, dtype: object

In [35]:
movies.value_counts() # count how many movies each lead actor has

Akshay Kumar        48
Amitabh Bachchan    45
Ajay Devgn          38
Salman Khan         31
Sanjay Dutt         26
                    ..
Diganth              1
Parveen Kaur         1
Seema Azmi           1
Akanksha Puri        1
Edwin Fernandes      1
Name: lead, Length: 566, dtype: int64
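`value_counts()` can also return proportions instead of raw counts through its `normalize` parameter. A minimal sketch with a four-element stand-in for the `lead` column; the Series below is made up for illustration.

```python
import pandas as pd

leads = pd.Series(['Akshay Kumar', 'Ajay Devgn', 'Akshay Kumar', 'Salman Khan'])

counts = leads.value_counts()                 # sorted, most frequent first
shares = leads.value_counts(normalize=True)   # fractions that sum to 1
```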

## .sort_values()

In [36]:
movies.sort_values()

movie
Qaidi Band                            Aadar Jain
Roar: Tigers of the Sundarbans      Aadil Chahal
Lipstick Under My Burkha            Aahana Kumra
Raat Gayi Baat Gayi?                Aamir Bashir
Talaash: The Answer Lies Within       Aamir Khan
                                        ...     
Dil Toh Deewana Hai                  Zeenat Aman
Sallu Ki Shaadi                      Zeenat Aman
Strings of Passion                   Zeenat Aman
Dunno Y... Na Jaane Kyon             Zeenat Aman
Taj Mahal: An Eternal Love Story     Zulfi Sayed
Name: lead, Length: 1500, dtype: object

In [37]:
movies.sort_values(ascending=False) # descending order

movie
Taj Mahal: An Eternal Love Story     Zulfi Sayed
Dil Toh Deewana Hai                  Zeenat Aman
Strings of Passion                   Zeenat Aman
Sallu Ki Shaadi                      Zeenat Aman
Dunno Y... Na Jaane Kyon             Zeenat Aman
                                        ...     
Fanaa (2006 film)                     Aamir Khan
Raat Gayi Baat Gayi?                Aamir Bashir
Lipstick Under My Burkha            Aahana Kumra
Roar: Tigers of the Sundarbans      Aadil Chahal
Qaidi Band                            Aadar Jain
Name: lead, Length: 1500, dtype: object

## .sort_index()

In [38]:
data.sort_index()

match_no
1       1
2      23
3      13
4      12
5       1
       ..
211     0
212    20
213    73
214    25
215     7
Name: runs, Length: 215, dtype: int64

In [39]:
data.sort_index(ascending=False) # descending order; returns a new Series, data itself is unchanged
data

match_no
1       1
2      23
3      13
4      12
5       1
       ..
211     0
212    20
213    73
214    25
215     7
Name: runs, Length: 215, dtype: int64

In [40]:
data.sort_index(ascending=False,inplace=True) # inplace=True modifies data itself
data

match_no
215     7
214    25
213    73
212    20
211     0
       ..
5       1
4      12
3      13
2      23
1       1
Name: runs, Length: 215, dtype: int64
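The difference between the last two cells can be condensed into a small sketch. The three-element `runs` Series is a stand-in for the real data; the other names are illustrative.

```python
import pandas as pd

runs = pd.Series([1, 23, 13], index=[1, 2, 3])

# Without inplace, a sorted copy is returned and runs is untouched
sorted_copy = runs.sort_index(ascending=False)
index_before = list(runs.index)

# With inplace=True, runs itself is reordered and the call returns None
result = runs.sort_index(ascending=False, inplace=True)
index_after = list(runs.index)
```

Because the in-place call returns `None`, assigning its result back to the variable would lose the data.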

## .copy()

In [41]:
temp = data.copy()
temp

match_no
215     7
214    25
213    73
212    20
211     0
       ..
5       1
4      12
3      13
2      23
1       1
Name: runs, Length: 215, dtype: int64
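The point of `.copy()` is that changes to the copy do not reach the original. A minimal sketch, using three values taken from the output above; the assignment of `0` is an arbitrary edit for demonstration.

```python
import pandas as pd

data = pd.Series([7, 25, 73], index=[215, 214, 213], name='runs')

temp = data.copy()     # independent copy (deep by default)
temp.loc[215] = 0      # modify the copy only

original_value = data.loc[215]
copied_value = temp.loc[215]
```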

## .sum()

In [42]:
vk =  pd.read_csv("datasets/kohli_ipl.csv",index_col='match_no').squeeze("columns")
vk

match_no
1       1
2      23
3      13
4      12
5       1
       ..
211     0
212    20
213    73
214    25
215     7
Name: runs, Length: 215, dtype: int64

In [43]:
vk.sum() # sum

6634

- ### .product()

In [55]:
vk.product()

0

- ### .prod()

In [56]:
vk.prod()

0
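Both results above are `0` because the data contains at least one score of 0 (the `.value_counts()` output earlier shows the value 0 occurring 9 times), and a single zero makes the whole product zero. A small sketch with made-up values:

```python
import pandas as pd

with_zero = pd.Series([1, 23, 0, 12])
without_zero = pd.Series([1, 23, 13, 12])

zero_product = with_zero.product()     # any zero makes the product 0
full_product = without_zero.product()  # 1 * 23 * 13 * 12
same_as_prod = without_zero.prod()     # prod() is the same operation
```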

- ### .mean()

In [44]:
vk.mean()

30.855813953488372

- ### .median()

In [45]:
vk.median()

24.0

- ### .mode()

In [46]:
sub = pd.read_csv("./datasets/subs.csv").squeeze("columns")
sub

0       48
1       57
2       40
3       43
4       44
      ... 
360    231
361    226
362    155
363    144
364    172
Name: Subscribers gained, Length: 365, dtype: int64

In [47]:
sub.mode()

0    105
Name: Subscribers gained, dtype: int64
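`.mode()` returns a Series rather than a single number because several values can tie for most frequent. A minimal sketch with made-up data:

```python
import pandas as pd

tied = pd.Series([1, 1, 2, 2, 3])

# 1 and 2 both appear twice, so both are returned (in sorted order)
modes = tied.mode()
```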

- ### .std()

In [48]:
sub.std()

62.6750230372527

- ### .var()

In [49]:
sub.var()

3928.1585127201565
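The variance is the square of the standard deviation, which matches the two outputs above (62.675... squared is about 3928.16). By default pandas computes the sample versions (`ddof=1`). The sketch below uses the first five subscriber values shown earlier.

```python
import math
import pandas as pd

s = pd.Series([48, 57, 40, 43, 44])

sample_var = s.var()            # divides by n - 1 (ddof=1, the default)
population_var = s.var(ddof=0)  # divides by n
std_squared = s.std() ** 2      # equals the sample variance
```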

- ### .min()

In [50]:
sub.min()

33

- ### .max()

In [51]:
sub.max()

396

- ### .describe()

In [52]:
sub.describe()

count    365.000000
mean     135.643836
std       62.675023
min       33.000000
25%       88.000000
50%      123.000000
75%      177.000000
max      396.000000
Name: Subscribers gained, dtype: float64
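The result of `.describe()` is itself a Series, so individual statistics can be looked up by label. A minimal sketch, again using the first five subscriber values as stand-in data:

```python
import pandas as pd

s = pd.Series([48, 57, 40, 43, 44])
summary = s.describe()

count = summary['count']    # number of non-null values
median = summary['50%']     # the 50% quantile is the median
maximum = summary['max']
```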

- ### .info()

In [53]:
sub.info()

<class 'pandas.core.series.Series'>
RangeIndex: 365 entries, 0 to 364
Series name: Subscribers gained
Non-Null Count  Dtype
--------------  -----
365 non-null    int64
dtypes: int64(1)
memory usage: 3.0 KB
