## Series

In pandas, a Series is a **one-dimensional labeled array** that can hold any data type. It is similar to a column in a spreadsheet or a single column of data in a DataFrame. A Series consists of two main components: the data and the index.

Here's an example of creating a Series in pandas:

```python
import pandas as pd

# Create a Series from a list
data = [10, 20, 30, 40, 50]
s = pd.Series(data)
print(s)
```

Output:
```
0    10
1    20
2    30
3    40
4    50
dtype: int64
```

In the above example, we create a Series from a list `data`. By default, pandas assigns a numeric index starting from 0 to each element in the Series. The data type of the Series is `int64`.

A Series can also have a custom index that provides labels for each element. Here's an example:

```python
import pandas as pd

# Create a Series with custom index
data = [10, 20, 30, 40, 50]
index = ['A', 'B', 'C', 'D', 'E']
s = pd.Series(data, index=index)
print(s)
```

Output:
```
A    10
B    20
C    30
D    40
E    50
dtype: int64
```

In this case, we specify a custom index `['A', 'B', 'C', 'D', 'E']` when creating the Series. The index labels provide a way to identify and access the elements in the Series.

You can also create a Series from a dictionary by passing it to the `pd.Series` constructor. The dictionary keys become the index labels, and the values become the corresponding data in the Series.

Here's an example:

```python
import pandas as pd

# Create a dictionary
data = {'A': 10, 'B': 20, 'C': 30}

# Create a Series from the dictionary
series = pd.Series(data)

print(series)
```

Output:
```
A    10
B    20
C    30
dtype: int64
```

In the above example, we first create a dictionary `data` with keys `'A'`, `'B'`, and `'C'`, and their corresponding values `10`, `20`, and `30`. Then, we pass this dictionary to `pd.Series()` to create a Series called `series`. The resulting Series has the dictionary keys as index labels and the dictionary values as the data in the Series.
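A related behaviour, sketched here as an aside: if you pass both a dictionary and an `index`, pandas looks up each requested label in the dictionary, so the index controls the order, and labels that are missing from the dictionary get `NaN`. The label `'X'` and the name `selected` below are made up purely to illustrate the missing-value case.

```python
import pandas as pd

data = {'A': 10, 'B': 20, 'C': 30}

# Ask for the labels in a different order; 'X' is a hypothetical label
# that does not exist in the dictionary, so its value becomes NaN
selected = pd.Series(data, index=['C', 'A', 'X'])
print(selected)

# Because NaN is a floating-point value, the Series is stored as float64
print(selected.dtype)  # float64
```

Note that `'B'` is dropped because it was not requested in `index`.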

Here are a few ways you can work with Series in pandas:

1. **Accessing data:** You can use the index labels to access specific elements in the Series. For example, `s['C']` returns the element with index label 'C'.

2. **Series operations:** Series objects support various operations such as arithmetic operations, element-wise operations, and statistical operations. For instance, you can use `s + 5` to add 5 to each element of the Series.

3. **Index operations:** You can perform operations on the index, such as reindexing, resetting the index, and checking whether an index label exists. For example, `s.reindex(['A', 'B', 'F'])` returns a new Series with exactly those labels; labels missing from `s` (here `'F'`) get `NaN`.

4. **Data alignment:** Arithmetic between two Series aligns values by index label rather than by position, so you can combine Series with different lengths or indexes. Labels that appear in only one of them produce `NaN` in the result.

5. **Series attributes and methods:** Series objects have various attributes and methods to provide information about the data, such as `s.shape` to get the dimensions of the Series, `s.mean()` to calculate the mean of the elements, and `s.unique()` to retrieve the unique values.
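The five points above can be tried out in one short script. This is only a sketch: the second Series `other` and the variable names are assumptions introduced for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])

# 1. Accessing data by index label
print(s['C'])  # 30

# 2. Element-wise arithmetic
plus_five = s + 5
print(plus_five.tolist())  # [15, 25, 35, 45, 55]

# 3. Index operations: 'F' is not a label of s, so it becomes NaN
reindexed = s.reindex(['A', 'B', 'F'])
print(reindexed)
print('F' in s.index)  # False

# 4. Data alignment: a hypothetical second Series that shares only 'C' and 'D'
other = pd.Series([1, 2, 3], index=['C', 'D', 'Z'])
total = s + other  # labels found in only one Series give NaN
print(total)

# 5. Attributes and methods
print(s.shape)     # (5,)
print(s.mean())    # 30.0
print(s.unique())  # array of the distinct values
```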

These are just a few examples of how Series can be used in pandas. Series provide a flexible and efficient way to work with one-dimensional data and are commonly used in pandas along with DataFrames for data analysis and manipulation tasks.

### `loc` and `iloc`

In pandas, the `loc` and `iloc` indexers are used to access and modify data in a Series. They are attributes used with square brackets, not methods called with parentheses.

Here's how each one works:

1. **`loc` indexer:** `loc` accesses elements in a Series by label. It accepts a single label, a list of labels, or a label slice, and returns the corresponding elements.

Syntax:
```python
series.loc[label]
series.loc[[label1, label2, ...]]
```

Example:
```python
import pandas as pd

series = pd.Series([10, 20, 30, 40], index=['A', 'B', 'C', 'D'])
print(series.loc['B'])  # Output: 20
print(series.loc[['A', 'C']])  # Output: 
# A    10
# C    30
# dtype: int64
```

2. **`iloc` indexer:** `iloc` accesses elements in a Series by integer position, counting from 0. It accepts a single integer, a list of integers, or an integer slice, and returns the corresponding elements.

Syntax:
```python
series.iloc[position]
series.iloc[[position1, position2, ...]]
```

Example:
```python
import pandas as pd

series = pd.Series([10, 20, 30, 40], index=['A', 'B', 'C', 'D'])
print(series.iloc[1])  # Output: 20
print(series.iloc[[0, 2]])  # Output: 
# A    10
# C    30
# dtype: int64
```

With both `loc` and `iloc`, you can pass a single value or a list of values, so you can select several elements from the Series at once.

The two differ when slicing: a `loc` slice includes its end label, while an `iloc` slice excludes its end position. For example, `series.loc['A':'C']` includes the elements with labels 'A', 'B', and 'C', whereas `series.iloc[0:2]` includes the elements at positions 0 and 1, but not 2.

These indexers are the main tools for indexing, slicing, and modifying data in a pandas Series by label or by position.
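As a small sketch of the slicing behaviour, the snippet below contrasts a label slice with a position slice. It then uses a Series whose labels are themselves integers, where `loc` and `iloc` can return different elements for the same number, and finishes with assignment through both. The integer-labelled Series `numbered` and the other variable names are made up for illustration.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40], index=['A', 'B', 'C', 'D'])

# Label slice: the end label 'C' is included
label_slice = series.loc['A':'C']
print(label_slice.tolist())  # [10, 20, 30]

# Position slice: the end position 2 is excluded
position_slice = series.iloc[0:2]
print(position_slice.tolist())  # [10, 20]

# Hypothetical Series whose labels are integers in reverse order
numbered = pd.Series(['x', 'y', 'z'], index=[2, 1, 0])
print(numbered.loc[0])   # z (the element labelled 0)
print(numbered.iloc[0])  # x (the element in the first position)

# Both indexers also support assignment; work on a copy to keep `series` intact
updated = series.copy()
updated.loc['B'] = 99   # set by label
updated.iloc[0] = -1    # set by position
print(updated.tolist())  # [-1, 99, 30, 40]
```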

### Some Series attributes

In pandas Series, the `index` and `values` attributes provide access to the index labels and the corresponding data values, respectively.

1. **`index`**: The `index` attribute returns the index labels associated with the Series. These labels identify the elements of the Series; they are usually unique, but pandas does not require them to be. The index can be of any data type, such as integers, strings, or dates.

Here's an example:

```python
import pandas as pd

series = pd.Series([10, 20, 30], index=['A', 'B', 'C'])
print(series.index)
```

Output:
```
Index(['A', 'B', 'C'], dtype='object')
```

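In the above example, `series.index` returns an `Index` object containing the index labels `'A'`, `'B'`, and `'C'`. The `dtype='object'` indicates that the labels are stored as generic Python objects, which is how pandas has traditionally held strings; depending on your pandas version, a dedicated string dtype may be shown instead.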

2. **`values`**: The `values` attribute returns the data values of the Series. It provides access to the underlying array of data that the Series holds.

Here's an example:

```python
import pandas as pd

series = pd.Series([10, 20, 30], index=['A', 'B', 'C'])
print(series.values)
```

Output:
```
[10 20 30]
```

In the above example, `series.values` returns an array containing the data values `[10, 20, 30]`. The values are returned as a one-dimensional NumPy array.

You can access and manipulate the index labels and values of a Series using these attributes. For example, you can iterate over the index labels using `series.index` and access the corresponding values using `series.values`.
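One way to iterate over labels and values together is sketched below. The variable names are illustrative, and `Series.items()` and `Series.to_numpy()` are shown as alternatives to using the two attributes directly.

```python
import pandas as pd

series = pd.Series([10, 20, 30], index=['A', 'B', 'C'])

# Pair each index label with its value using the two attributes
pairs = [(label, int(value)) for label, value in zip(series.index, series.values)]
print(pairs)  # [('A', 10), ('B', 20), ('C', 30)]

# Series.items() yields the same (label, value) pairs directly
for label, value in series.items():
    print(label, value)

# Convert the index to a plain Python list and the data to a NumPy array
labels = series.index.tolist()
array = series.to_numpy()
print(labels)  # ['A', 'B', 'C']
print(array)   # [10 20 30]
```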

### Some Series methods

1. **`head(n)`:**
The `head(n)` method returns the first n elements of the Series. If you omit `n`, it returns the first 5.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
print(series.head(3))
```

Output:
```
0    10
1    20
2    30
dtype: int64
```

2. **`tail(n)`:**
The `tail(n)` method returns the last n elements of the Series. If you omit `n`, it returns the last 5.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
print(series.tail(3))
```

Output:
```
2    30
3    40
4    50
dtype: int64
```

3. **`describe()`:**
The `describe()` method computes various descriptive statistics of the Series, such as count, mean, standard deviation, minimum, maximum, and quartiles.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
print(series.describe())
```

Output:
```
count     5.000000
mean     30.000000
std      15.811388
min      10.000000
25%      20.000000
50%      30.000000
75%      40.000000
max      50.000000
dtype: float64
```

4. **`sum()`:**
The `sum()` method returns the sum of all the elements in the Series.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
print(series.sum())
```

Output:
```
150
```

5. **`mean()`:**
The `mean()` method returns the mean (average) of the elements in the Series.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
print(series.mean())
```

Output:
```
30.0
```
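The five-element examples above cannot show the default size of `head()` and `tail()`, so here is a short sketch with a longer, made-up Series named `numbers`. It also checks that the figures reported by `describe()` agree with the individual methods.

```python
import pandas as pd

# Hypothetical ten-element Series: 10, 20, ..., 100
numbers = pd.Series(range(10, 110, 10))

# Without an argument, head() and tail() return 5 elements each
first = numbers.head()
last = numbers.tail()
print(first.tolist())  # [10, 20, 30, 40, 50]
print(last.tolist())   # [60, 70, 80, 90, 100]

# describe() returns a Series, so individual statistics can be read by label
stats = numbers.describe()
print(stats['count'])  # 10.0
print(stats['mean'])   # 55.0
print(stats['mean'] == numbers.mean())  # True
print(numbers.sum())   # 550
```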

### Boolean masking

Boolean masking in pandas Series allows you to filter and select elements based on a boolean condition. It involves using a boolean array or a Series of the same length as the original Series to specify which elements should be selected.

Here's how boolean masking works in pandas Series:

1. Creating a boolean condition:
First, you create a boolean condition by applying a comparison or logical operation to the Series. This results in a boolean array or Series where each element corresponds to the outcome of the condition for the corresponding element in the original Series.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
condition = series > 30
```

In the above example, the boolean condition `series > 30` checks whether each element in `series` is greater than 30. The result is a boolean Series with the values `[False, False, False, True, True]` and the same index as `series`.

2. Applying the boolean mask:
Next, you use the boolean condition to select the desired elements from the original Series. You can do this by passing the boolean condition inside square brackets `[]`.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
condition = series > 30

filtered_series = series[condition]
print(filtered_series)
```

Output:
```
3    40
4    50
dtype: int64
```

In the above example, `series[condition]` applies the boolean mask to the `series` Series. It returns a new Series that contains only the elements for which the corresponding value in the boolean condition is `True`. In this case, it selects the elements at positions 3 and 4, which have values 40 and 50, respectively.

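You can also combine multiple conditions using the element-wise operators `&` (and), `|` (or), and `~` (not) to create more complex boolean masks. Wrap each condition in parentheses, because these operators bind more tightly than comparisons such as `>`.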

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
condition = (series > 20) & (series < 50)

filtered_series = series[condition]
print(filtered_series)
```

Output:
```
2    30
3    40
dtype: int64
```

In this example, the boolean condition `(series > 20) & (series < 50)` creates a mask that selects elements greater than 20 and less than 50. It returns a new Series containing the elements at index labels 2 and 3, which have values 30 and 40.

Boolean masking is a powerful technique for filtering and selecting specific elements in a pandas Series based on a condition. It allows you to perform various operations, such as filtering out outliers, selecting values within a certain range, or finding elements that meet specific criteria.
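For completeness, here is a brief sketch of the `|` (or) and `~` (not) operators, plus the same kind of mask passed through `loc`. The variable names are illustrative only.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])

# OR: keep values below 20 or above 40
outside = series[(series < 20) | (series > 40)]
print(outside.tolist())  # [10, 50]

# NOT: invert a mask to keep values that are not greater than 30
not_large = series[~(series > 30)]
print(not_large.tolist())  # [10, 20, 30]

# A boolean mask can also be passed to loc
via_loc = series.loc[series > 30]
print(via_loc.tolist())  # [40, 50]
```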

### Deleting items from Series

To delete an item from a pandas Series, you can use the `drop` method. The `drop` method allows you to remove one or more items from the Series by specifying their index labels.

Here's how you can delete an item from a Series:

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])
print("Original Series:")
print(series)

# Delete an item by index label
modified_series = series.drop('C')
print("\nModified Series:")
print(modified_series)
```

Output:
```
Original Series:
A    10
B    20
C    30
D    40
E    50
dtype: int64

Modified Series:
A    10
B    20
D    40
E    50
dtype: int64
```

In the above example, we have a Series `series` with index labels `'A'`, `'B'`, `'C'`, `'D'`, and `'E'`. We want to delete the item with index label `'C'`. We use the `drop` method and pass the index label `'C'` as an argument. The `drop` method returns a modified Series `modified_series` with the item at index label `'C'` removed.

Note that the `drop` method returns a new Series with the item(s) removed and does not modify the original Series. If you want to modify the Series in place, you can pass `inplace=True` to the `drop` method; in that case the method returns `None` rather than a new Series.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])
print("Original Series:")
print(series)

# Delete an item by index label (inplace)
series.drop('C', inplace=True)
print("\nModified Series:")
print(series)
```

Output:
```
Original Series:
A    10
B    20
C    30
D    40
E    50
dtype: int64

Modified Series:
A    10
B    20
D    40
E    50
dtype: int64
```

In this example, the `drop` method is used with `inplace=True` to modify the original Series `series` by deleting the item with index label `'C'`.

You can also delete multiple items from a Series by passing a list of index labels to the `drop` method.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])
print("Original Series:")
print(series)

# Delete multiple items by index labels
modified_series = series.drop(['B', 'D'])
print("\nModified Series:")
print(modified_series)
```

Output:
```
Original Series:
A    10
B    20
C    30
D    40
E    50
dtype: int64

Modified Series:
A    10
C    30
E    50
dtype: int64
```

In this example, the `drop` method is used to delete multiple items with index labels `'B'` and `'D'` from the Series.

By default, `drop` leaves the original Series untouched and returns a new one, so you can remove specific items without losing the original data.
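One detail worth a quick sketch is what happens when the label you try to drop does not exist. The label `'Z'` below is made up for the purpose; by default `drop` raises a `KeyError`, and passing `errors='ignore'` suppresses it. The last lines show boolean masking on the index as an alternative way to exclude a label.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])

# Dropping a label that is not in the index raises KeyError by default
try:
    series.drop('Z')  # 'Z' is a hypothetical, non-existent label
except KeyError as exc:
    print("KeyError:", exc)

# With errors='ignore', missing labels are skipped silently
unchanged = series.drop('Z', errors='ignore')
print(unchanged.equals(series))  # True

# Alternative: keep every element whose label is not 'C'
kept = series[series.index != 'C']
print(kept.index.tolist())  # ['A', 'B', 'D', 'E']
```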
