# Series
#### Table of Contents
- Basics
- Creating a Series
- Series Attributes
- Accessing Elements from Series
- Modifying a Series
- Other Methods on Series

## Basics
* A Pandas Series is like a column in a table.
* A Series is a one-dimensional labeled array that holds the values of a single variable, captured from multiple observations.
![Pandas_1](https://github.com/abdurahimank/Pandas_Tutorial/blob/main/images/Pandas_1.png?raw=true)

#### Series Attributes and Methods
Pandas Series come with various attributes and methods to help you manipulate and analyze data effectively. <br>Here are a few essential ones:

- **values**: Returns the Series data as a NumPy array.
- **index**: Returns the index (labels) of the Series.
- **shape**: Returns a tuple representing the dimensions of the Series.
- **size**: Returns the number of elements in the Series.
- **mean()**, **sum()**, **min()**, **max()**: Calculate summary statistics of the data.
- **unique()**, **nunique()**: Get unique values or the number of unique values.
- **sort_values()**, **sort_index()**: Sort the Series by values or index labels.
- **isnull()**, **notnull()**: Check for missing (NaN) or non-missing values.
- **apply()**: Apply a custom function to each element of the Series.
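
As a rough illustration of the attributes and summary statistics listed above, here is a minimal sketch. The Series `scores` and its values are made up for this example and do not come from the rest of the tutorial.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
scores = pd.Series([10, 20, 20, 40], index=["a", "b", "c", "d"])

print(scores.values)               # underlying NumPy array
print(scores.index)                # the labels
print(scores.shape)                # (4,)
print(scores.size)                 # 4
print(scores.sum())                # 90
print(scores.mean())               # 22.5
print(scores.min(), scores.max())  # 10 40
print(scores.unique())             # distinct values, in order of appearance
print(scores.nunique())            # 3
```

Note that `values`, `index`, `shape`, and `size` are attributes (no parentheses), while the statistics are methods and must be called.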

In [2]:
import pandas as pd

## Creating a Series

**Creating Series from Lists and Tuples**

In [2]:
a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78])
a

0    35.46
1    78.89
2    34.23
3    97.12
4    15.78
dtype: float64
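
The heading mentions tuples as well as lists, but only a list is shown. A tuple can be passed in the same way, as in this short sketch (the variable name `b` is just an illustrative choice):

```python
import pandas as pd

# A tuple is accepted the same way as a list
b = pd.Series((35.46, 78.89, 34.23))
print(b)
```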

**Key/Value Objects as Series**
* You can also use a key/value object, like a dictionary, when creating a Series.
* The keys of the dictionary become the labels.

In [13]:
import pandas as pd
calories = {"day1": 420, "day2": 380, "day3": 390}
myvar = pd.Series(calories)
print(myvar)

day1    420
day2    380
day3    390
dtype: int64


In [12]:
# Passing index keeps only the listed keys
import pandas as pd
calories = {"day1": 420, "day2": 380, "day3": 390}
myvar = pd.Series(calories, index = ["day1", "day2"])
print(myvar)

day1    420
day2    380
dtype: int64
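
One detail worth knowing when combining a dictionary with the `index` argument: a label that is not a key of the dictionary is kept, with a missing value. The sketch below assumes a hypothetical label `"day4"` that is absent from `calories`.

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

# "day4" is a hypothetical label that is not a key of the dictionary
partial = pd.Series(calories, index=["day1", "day4"])
print(partial)
# The missing entry becomes NaN, so the dtype changes from int64 to float64
```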


#### Labels
- If nothing else is specified, the values are labeled with their position number: the first value has index 0, the second value has index 1, and so on.

In [3]:
a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78])
print(a)
print(a[0])

0    35.46
1    78.89
2    34.23
3    97.12
4    15.78
dtype: float64
35.46


* With the ```index``` argument, you can name your own labels.
* When you have created labels, you can access an item by referring to the label.

In [4]:
# using index when creating series
a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78], index = ["Brasil", "Russia", "India", 
                                                            "China", "SA"])
print(a)
print(a["Russia"])

Brasil    35.46
Russia    78.89
India     34.23
China     97.12
SA        15.78
dtype: float64
78.89


In [5]:
# using index after creating series
a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78])
print(a)
a.index = ["Brasil", 
          "Russia", 
          "India",
          "China",
          "SA"]
print(a)
print(a["Russia"])

0    35.46
1    78.89
2    34.23
3    97.12
4    15.78
dtype: float64
Brasil    35.46
Russia    78.89
India     34.23
China     97.12
SA        15.78
dtype: float64
78.89


## Series Attributes

#### name
The ```name``` attribute gives a name to a Series.

In [6]:
a.name = "G7 Population in millions"
a

Brasil    35.46
Russia    78.89
India     34.23
China     97.12
SA        15.78
Name: G7 Population in millions, dtype: float64

In [7]:
a.name

'G7 Population in millions'

#### type of elements
- ```dtype``` returns the data type of the elements.

In [8]:
a.dtype

dtype('float64')

In [6]:
a.values

array([35.46, 78.89, 34.23, 97.12, 15.78])

In [7]:
type(a.values)

numpy.ndarray

In [9]:
a.index

Index(['Brasil', 'Russia', 'India', 'China', 'SA'], dtype='object')

In [10]:
import pandas as pd
certificates_earned = pd.Series([8, 2, 5, 6], index=['Tom', 'Kris', 'Ahmad', 'Beau'])
certificates_earned

Tom      8
Kris     2
Ahmad    5
Beau     6
dtype: int64

## Accessing Elements from Series
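- ```seriesname[index value]``` - by integer position (deprecated on a label-indexed Series in recent pandas versions; prefer ```iloc```)
- ```seriesname[label name]``` - by label
- ```loc``` - label based
- ```iloc``` - integer position based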

In [13]:
a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78], 
             index = ["Brasil", "Russia", "India", "China", "SA"], name = "Brics Nations GDP")
a

Brasil    35.46
Russia    78.89
India     34.23
China     97.12
SA        15.78
Name: Brics Nations GDP, dtype: float64

In [16]:
# accessing by position (a[2] on a label-indexed Series is deprecated in recent pandas)
print(a.iloc[2])

34.23


In [11]:
# accessing by label
print(a["China"])

97.12


In [12]:
# with "iloc" attribute
print(a.iloc[4])

15.78


In [18]:
# multiple elements with index name
print(a[["India", "China"]])

India    34.23
China    97.12
Name: Brics Nations GDP, dtype: float64


In [19]:
# multiple elements with "iloc"
print(a.iloc[[0, 4]])

Brasil    35.46
SA        15.78
Name: Brics Nations GDP, dtype: float64
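
The list at the start of this section mentions `loc`, which the cells above do not demonstrate. A minimal sketch, reusing the same Series, might look like this. It also shows one difference that is easy to miss: a label slice with `loc` includes its end label, while a position slice with `iloc` excludes its end position.

```python
import pandas as pd

a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78],
              index=["Brasil", "Russia", "India", "China", "SA"],
              name="Brics Nations GDP")

print(a.loc["China"])             # single label
print(a.loc[["India", "China"]])  # list of labels
print(a.loc["Russia":"China"])    # label slice: includes "China"
print(a.iloc[1:3])                # position slice: excludes position 3
```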


#### Conditional Selection

In [21]:
a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78], 
             index = ["Brasil", "Russia", "India", "China", "SA"], name = "Brics Nations GDP")
a

Brasil    35.46
Russia    78.89
India     34.23
China     97.12
SA        15.78
Name: Brics Nations GDP, dtype: float64

In [23]:
a[a > 50]

Russia    78.89
China     97.12
Name: Brics Nations GDP, dtype: float64
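
Conditions can also be combined. The sketch below assumes the same Series and uses `&` (and), `|` (or), and `~` (not). Each condition needs its own parentheses, because these operators bind more tightly than the comparisons.

```python
import pandas as pd

a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78],
              index=["Brasil", "Russia", "India", "China", "SA"],
              name="Brics Nations GDP")

between = a[(a > 30) & (a < 80)]   # both conditions must hold
extremes = a[(a < 20) | (a > 90)]  # either condition may hold
not_above = a[~(a > 50)]           # negation of a condition

print(between)
print(extremes)
print(not_above)
```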

## Modifying a Series

In [24]:
a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78], 
             index = ["Brasil", "Russia", "India", "China", "SA"], name = "Brics Nations GDP")
a

Brasil    35.46
Russia    78.89
India     34.23
China     97.12
SA        15.78
Name: Brics Nations GDP, dtype: float64

In [25]:
a["Brasil"] = 50
a

Brasil    50.00
Russia    78.89
India     34.23
China     97.12
SA        15.78
Name: Brics Nations GDP, dtype: float64

In [26]:
a[a < 50] = 50
a

Brasil    50.00
Russia    78.89
India     50.00
China     97.12
SA        50.00
Name: Brics Nations GDP, dtype: float64
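
Beyond overwriting existing values, a Series can also grow or shrink. The following is a small sketch under stated assumptions: the label `"Egypt"` and its value are hypothetical and only serve to show the mechanics.

```python
import pandas as pd

a = pd.Series([35.46, 78.89, 34.23, 97.12, 15.78],
              index=["Brasil", "Russia", "India", "China", "SA"],
              name="Brics Nations GDP")

# Assigning to a label that does not exist yet appends a new element
a.loc["Egypt"] = 12.5  # hypothetical value, for illustration only

# drop() returns a new Series and leaves the original unchanged
trimmed = a.drop("SA")

print(a)
print(trimmed)
```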

## Other Methods on Series
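
The Basics section lists several methods that have not been demonstrated so far. As a minimal sketch, the example below tries sorting, missing-value checks, and `apply()` on a small made-up Series (the data and the name `s` are assumptions for illustration).

```python
import pandas as pd

# Hypothetical data with one missing value
s = pd.Series([3.0, float("nan"), 1.0, 2.0], index=["d", "b", "a", "c"])

by_value = s.sort_values()         # ascending by value; NaN is placed last by default
by_label = s.sort_index()          # ascending by label
missing = s.isnull()               # True where the value is NaN
present_count = s.notnull().sum()  # number of non-missing values
doubled = s.apply(lambda x: x * 2) # apply a function to each element

print(by_value)
print(by_label)
print(missing)
print(present_count)
print(doubled)
```

Both sorting methods return a new Series, so `s` itself keeps its original order.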