### PANDAS SERIES

In [1]:
!pip3 install pandas

Collecting pandas
  Downloading pandas-2.3.3-cp313-cp313-macosx_11_0_arm64.whl.metadata (91 kB)
Collecting pytz>=2020.1 (from pandas)
  Downloading pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas)
  Downloading tzdata-2025.2-py2.py3-none-any.whl.metadata (1.4 kB)
Downloading pandas-2.3.3-cp313-cp313-macosx_11_0_arm64.whl (10.7 MB)
Downloading pytz-2025.2-py2.py3-none-any.whl (509 kB)
Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB)
Installing collected packages: pytz, tzdata, pandas
Successfully installed pandas-2.3.3 pytz-2025.2 tzdata-2025.2


In [44]:
# Importing
import numpy as np
import pandas as pd
import requests
from io import StringIO
import warnings

# Series
`Series` is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. The basic method to create a Series is to call:
```python
s = pd.Series(data, index=index)
```

Here, **data** can be many different things:
* Python dict

* an ndarray

* a scalar value (like 5)

The passed index is a list of axis labels. Thus, this separates into a few cases depending on what data is:

**From ndarray**
  - If data is an ndarray, index must be the same length as data. If no index is passed, one will be created having values [0, ..., len(data) - 1].

**From Dict**
  - If an index is passed, the values in data corresponding to the labels in the index will be pulled out.

**From Scalar**
  - If data is a scalar value, an index must be provided. The value will be repeated to match the length of index.
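The dict and ndarray cases can be sketched as follows. This is a minimal illustration only: the `scores` dict and the index labels are hypothetical and do not come from the datasets used later.

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only for illustration
scores = {"maths": 67, "english": 57, "science": 89}

# From dict with an explicit index: values are looked up by label,
# and a label missing from the dict ("hindi") becomes NaN
picked = pd.Series(scores, index=["science", "maths", "hindi"])
print(picked)

# From ndarray: the index length must match the data length
arr = np.array([50, 60, 70])
try:
    pd.Series(arr, index=["a", "b"])
except ValueError as exc:
    print("ValueError:", exc)
```

Because NaN is a floating point value, the Series built from the dict is upcast to `float64` even though the dict holds integers.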

In [10]:
# Pandas Series Creation
# Method 1: Through List

# Series creation with default indexes
countries = ['India',"Pakistan","US","Russia","UK"]
pd.Series(countries)

# Series with integer values
marks = [50,60,70,80]
pd.Series(marks)

# Series creation with custom indexing
subject = ["Social Science","Maths","English","Science"]
marks_series = pd.Series(marks,index = subject)

# Method 2: Through dictionaries
marks = {
    'maths':67,
    'english':57,
    'science':89,
    'hindi':100
}
pd.Series(marks)

# Method 3: From Scalar
pd.Series(5,index = ['A','B','C','D'])

A    5
B    5
C    5
D    5
dtype: int64

### Series Attributes
* name - returns the name of the series
* size - returns the number of elements in the series
* dtype - returns the data type of the values in the series
* index - returns the index (axis labels) of the series as an `Index` object
* values - returns the values of the series as an array (a NumPy ndarray in the examples here)
* empty - returns `True` if the series has no elements


In [20]:
print(marks_series.size)
print(marks_series.dtype)
print(marks_series.index)
print(marks_series.values)
print(marks_series.empty)

4
int64
Index(['Social Science', 'Maths', 'English', 'Science'], dtype='object')
[50 60 70 80]
False
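The cell above does not exercise `name`, and `size` is easy to confuse with `shape`. A small sketch, assuming a hypothetical Series `s` that is given the name `marks` at construction:

```python
import pandas as pd

# Hypothetical marks, mirroring the shape of the example above
s = pd.Series(
    [50, 60, 70, 80],
    index=["Social Science", "Maths", "English", "Science"],
    name="marks",
)

print(s.name)           # the label given at construction
print(s.size)           # number of elements, a plain int
print(s.shape)          # axis dimensions, as a tuple
print(type(s.values))   # the underlying values as a NumPy array
```

A Series created without `name=` has `name` set to `None`; a Series read from a CSV column normally takes the column header as its name.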


### Series import using CSV
- By default, the pandas `read_csv` function returns a DataFrame. To get a Series instead, call `.squeeze()` on the result, which converts a single-column DataFrame into a Series.
- Note: before pandas 2.0, `squeeze` was a parameter of `read_csv` (deprecated in 1.4 and removed in 2.0).
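The conversion can be tried without any download. The sketch below assumes a hypothetical single-column CSV held in memory (`csv_text`) rather than the real datasets:

```python
from io import StringIO

import pandas as pd

# Hypothetical single-column CSV held in memory, so no download is needed
csv_text = "Subscribers gained\n48\n57\n40\n"

df = pd.read_csv(StringIO(csv_text))
print(type(df).__name__)

# squeeze("columns") collapses a one-column DataFrame into a Series
s = df.squeeze("columns")
print(type(s).__name__, s.name)
```

Passing `"columns"` is a cautious choice: a bare `.squeeze()` on a DataFrame with exactly one row and one column returns a scalar rather than a Series.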

In [45]:
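def dataset(link):
    # Download a CSV file and wrap its text in a file-like buffer for pd.read_csv
    # verify=False skips TLS certificate checks, so silence the warning it triggers
    warnings.filterwarnings('ignore', message='Unverified HTTPS request')
    response = requests.get(link, verify=False)
    # Raise on HTTP errors instead of silently returning None
    response.raise_for_status()
    return StringIO(response.text)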

In [47]:
# Dataset - 1
subs = pd.read_csv(dataset("https://drive.google.com/uc?export=download&id=1XQTnOAlodSzEqIQ0sJjnOv5zpG57OB4S")).squeeze()
subs

0       48
1       57
2       40
3       43
4       44
      ... 
360    231
361    226
362    155
363    144
364    172
Name: Subscribers gained, Length: 365, dtype: int64

In [58]:
# Dataset - 2
vk = pd.read_csv(dataset("https://drive.google.com/uc?export=download&id=19RVdLnwpCEO3GHzA3LuU2JhG7Or3UjFb"),index_col="match_no").squeeze()

In [59]:
# dataset - 3
movies = pd.read_csv(dataset("https://drive.google.com/uc?export=download&id=1H6XVxrhbinfe44s-ZHXKaGZF6gW_Qmzg"),index_col="movie").squeeze()
movies

movie
Uri: The Surgical Strike                   Vicky Kaushal
Battalion 609                                Vicky Ahuja
The Accidental Prime Minister (film)         Anupam Kher
Why Cheat India                            Emraan Hashmi
Evening Shadows                         Mona Ambegaonkar
                                              ...       
Hum Tumhare Hain Sanam                    Shah Rukh Khan
Aankhen (2002 film)                     Amitabh Bachchan
Saathiya (film)                             Vivek Oberoi
Company (film)                                Ajay Devgn
Awara Paagal Deewana                        Akshay Kumar
Name: lead, Length: 1500, dtype: object

### Series Methods
* head() -> returns the first 5 values of the series by default
* tail() -> returns the last 5 values of the series by default
* sample() -> returns one random entry from the series by default
* value_counts() -> returns the frequency of each distinct value in the series
* sort_values() -> returns the series sorted by value -> ascending by default -> returns a new series, the original is unchanged
* sort_index() -> returns the series sorted by index -> ascending by default -> returns a new series, the original is unchanged
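That the sort methods do not change the original can be checked directly. A minimal sketch, assuming a small hypothetical Series `s`:

```python
import pandas as pd

# Hypothetical data, used only for illustration
s = pd.Series([70, 50, 80], index=["c", "a", "b"])

# sort_values returns a new, sorted Series...
sorted_s = s.sort_values()

# ...while the original keeps its order
print(sorted_s.tolist())
print(s.tolist())

# Reassign to keep the sorted result
s = s.sort_values(ascending=False)
print(s.tolist())
```

`sort_index()` behaves the same way. Both methods also accept `inplace=True`, but reassigning the result is usually easier to follow.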

In [54]:
movies.head()

movie
Uri: The Surgical Strike                   Vicky Kaushal
Battalion 609                                Vicky Ahuja
The Accidental Prime Minister (film)         Anupam Kher
Why Cheat India                            Emraan Hashmi
Evening Shadows                         Mona Ambegaonkar
Name: lead, dtype: object

In [55]:
movies.tail(2)

movie
Company (film)            Ajay Devgn
Awara Paagal Deewana    Akshay Kumar
Name: lead, dtype: object

In [57]:
movies.sample()
# you can specify the number of samples
movies.sample(3)

movie
Laila Majnu (2018 film)     Avinash Tiwary
Badrinath Ki Dulhania         Varun Dhawan
Chhodon Naa Yaar           Jimmy Sheirgill
Name: lead, dtype: object

In [61]:
vk.value_counts()
movies.value_counts()

lead
Akshay Kumar        48
Amitabh Bachchan    45
Ajay Devgn          38
Salman Khan         31
Sanjay Dutt         26
                    ..
Diganth              1
Parveen Kaur         1
Seema Azmi           1
Akanksha Puri        1
Edwin Fernandes      1
Name: count, Length: 566, dtype: int64

In [67]:
subs.sort_values(ascending=False).head(1).values[0]

np.int64(396)

In [68]:
movies.sort_index()

movie
1920 (film)                   Rajniesh Duggall
1920: London                     Sharman Joshi
1920: The Evil Returns             Vicky Ahuja
1971 (2007 film)                Manoj Bajpayee
2 States (2014 film)              Arjun Kapoor
                                   ...        
Zindagi 50-50                      Veena Malik
Zindagi Na Milegi Dobara        Hrithik Roshan
Zindagi Tere Naam           Mithun Chakraborty
Zokkomon                       Darsheel Safary
Zor Lagaa Ke...Haiya!            Meghan Jadhav
Name: lead, Length: 1500, dtype: object