# <center><div style="width: 370px;"> ![Panel Data](pictures/Panel_Data.jpg)

# <center>Series

In [1]:
import numpy as np
import pandas as pd

`Series` is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. The basic method to create a Series is to call:

```python
s = pd.Series(data, index=index)
```

Here, `data` can be many different things:

- a Python dict
- an ndarray
- a scalar value (like 5)

The passed ***index*** is a list of axis labels. The behavior therefore depends on what ***data*** is:

### From ndarray

If ***data*** is an ndarray, ***index*** must be the same length as ***data***. If no index is passed, one will be created having values $[0, ..., len(data) - 1]$.

In [9]:
s = pd.Series(np.random.randint(0, 100, 5), index=['a', 'b', 'c', 'd', 'e'])

In [10]:
s

a    37
b    14
c     1
d    48
e    44
dtype: int64
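
The two rules above (default index, matching lengths) can be checked with a small sketch. The names `values`, `default_indexed`, and `mismatch_raised` are illustrative only and do not come from the original notebook.

```python
import numpy as np
import pandas as pd

values = np.array([10, 20, 30])

# No index passed: pandas builds a default integer index 0..len(values) - 1.
default_indexed = pd.Series(values)
print(list(default_indexed.index))  # [0, 1, 2]

# An index whose length differs from the data is rejected with ValueError.
try:
    pd.Series(values, index=["a", "b"])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```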

> ***Note:*** pandas supports non-unique index values. If an operation that does not support duplicate index values is attempted, an exception will be raised at that time. The reason for being lazy is nearly all performance-based (there are many instances in computations, like parts of GroupBy, where the index is not used).
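
As a rough illustration of this lazy behavior, the sketch below builds a Series with a repeated label and only hits an error when an operation that needs unique labels (here `reindex`) is attempted. The variable names are hypothetical.

```python
import pandas as pd

# Creating a Series with a duplicated label works without complaint.
dup = pd.Series([1, 2, 3], index=["a", "a", "b"])
print(dup.index.is_unique)  # False

# Selecting a duplicated label returns every matching value.
print(dup["a"].tolist())  # [1, 2]

# reindex requires unique labels, so the error surfaces only at this point.
try:
    dup.reindex(["a", "b"])
    reindex_raised = False
except ValueError:
    reindex_raised = True
print(reindex_raised)  # True
```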

### From dict

Series can be instantiated from dicts:


In [16]:
d = {'a': 2, 'b': 4, 'c': 0}
s = pd.Series(d)

In [17]:
s.index

Index(['a', 'b', 'c'], dtype='object')
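
A dict can also be combined with an explicit index. In that case the values are looked up by label, and labels missing from the dict become NaN. The following is a minimal sketch; the name `picked` is illustrative.

```python
import pandas as pd

d = {"a": 2, "b": 4, "c": 0}

# Values are pulled from the dict by label, in the order given by the index.
# The label "d" is not a key of the dict, so its value is NaN, which also
# forces the dtype to float64.
picked = pd.Series(d, index=["b", "c", "d", "a"])
print(picked)
```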

### From scalar value

If data is a scalar value, an index must be provided. The value will be repeated to match the length of index.

In [18]:
pd.Series(3, index=['a', 'b', 'c'])

a    3
b    3
c    3
dtype: int64

### Series is ndarray-like

Series acts very similarly to an ndarray, and is a valid argument to most NumPy functions. However, operations such as slicing will also slice the index.

In [21]:
s = pd.Series(np.random.random(10))
s

0    0.942829
1    0.894262
2    0.490630
3    0.129121
4    0.250253
5    0.623522
6    0.124655
7    0.903468
8    0.635358
9    0.045040
dtype: float64

In [22]:
s[0]

0.9428289466512927

In [23]:
s[:5]

0    0.942829
1    0.894262
2    0.490630
3    0.129121
4    0.250253
dtype: float64

In [24]:
s[s > s.median()]

0    0.942829
1    0.894262
5    0.623522
7    0.903468
8    0.635358
dtype: float64

In [25]:
np.exp(s)

0    2.567234
1    2.445531
2    1.633345
3    1.137827
4    1.284350
5    1.865486
6    1.132757
7    2.468147
8    1.887698
9    1.046069
dtype: float64

> **Note:** We will address array-based indexing like `s[[4, 3, 1]]` in the section on indexing.

In [27]:
s[[3, 4, 5]]

3    0.129121
4    0.250253
5    0.623522
dtype: float64

In [28]:
s[3, 4, 5]

KeyError: 'key of type tuple not found and not a MultiIndex'
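
The difference between the last two cells comes down to the key type: a list is read as several separate keys, while `s[3, 4, 5]` passes a single tuple, which pandas only accepts as a key into a MultiIndex. A small sketch, using a deterministic Series in place of the random one above (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10) * 1.5)

# A list selects several labels at once.
by_list = s[[3, 4, 5]]

# .iloc makes the intent explicit: select by position.
by_iloc = s.iloc[[3, 4, 5]]
print(by_list.equals(by_iloc))  # True (the default index matches positions)

# A tuple is treated as one key, which fails without a MultiIndex.
try:
    s[3, 4, 5]
    tuple_raised = False
except KeyError:
    tuple_raised = True
print(tuple_raised)  # True
```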

In [29]:
s.dtype

dtype('float64')

In [35]:
type(s.array)

pandas.core.arrays.numpy_.NumpyExtensionArray

In [32]:
s.to_numpy()

array([0.94282895, 0.89426247, 0.49062995, 0.12912065, 0.25025293,
       0.62352153, 0.12465474, 0.9034677 , 0.6353582 , 0.04503964])

In [37]:
type(s.to_numpy())

numpy.ndarray

### Series is dict-like

A Series is like a fixed-size dict in that you can get and set values by index label:

In [50]:
s = pd.Series(np.random.random(6), index=['a', 'b', 'c', 'd', 'e', 'f'])

In [51]:
s

a    0.164954
b    0.491002
c    0.170725
d    0.607901
e    0.185288
f    0.258652
dtype: float64

In [52]:
s["a"]

0.16495445266218822

In [53]:
'a' in s

True

In [54]:
'g' in s

False
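
The cells above only read values. A short sketch of setting a value by label, and of handling a missing label, might look like this (the Series contents and variable names are made up for illustration):

```python
import pandas as pd

s = pd.Series([0.1, 0.2, 0.3], index=["a", "b", "c"])

# Set a value by label, just like assigning to a dict key.
s["b"] = 5.0
print(s["b"])  # 5.0

# Looking up a missing label with [] raises KeyError ...
try:
    s["g"]
    key_error_raised = False
except KeyError:
    key_error_raised = True

# ... while get() returns None, or a default of your choice.
missing = s.get("g")
fallback = s.get("g", default=-1.0)
print(missing, fallback)  # None -1.0
```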

#### Vectorized operations and label alignment with Series

When working with raw NumPy arrays, looping through value-by-value is usually not necessary. The same is true when working with Series in pandas. Series can also be passed into most NumPy functions expecting an ndarray.

In [55]:
s + s

a    0.329909
b    0.982004
c    0.341449
d    1.215802
e    0.370577
f    0.517304
dtype: float64

In [56]:
s ** 2

a    0.027210
b    0.241083
c    0.029147
d    0.369544
e    0.034332
f    0.066901
dtype: float64

In [57]:
s[1:] + s[:-1]

a         NaN
b    0.982004
c    0.341449
d    1.215802
e    0.370577
f         NaN
dtype: float64

The result of an operation between unaligned Series will have the ***union*** of the indexes involved. If a label is not found in one Series or the other, the result for that label will be marked as missing (NaN). Being able to write code without doing any explicit data alignment grants immense freedom and flexibility in interactive data analysis and research. The integrated data alignment features of the pandas data structures set pandas apart from the majority of related tools for working with labeled data.
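
To make the union behavior concrete, here is a minimal sketch with two Series that only partly overlap. The names `left`, `right`, `total`, `filled`, and `overlap_only` are illustrative. It also shows two common ways of dealing with the resulting NaN values: `add` with `fill_value`, and `dropna`.

```python
import pandas as pd

left = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
right = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# The result index is the union a, b, c, d; "a" and "d" exist on only
# one side, so their sums are NaN.
total = left + right
print(total.index.tolist())  # ['a', 'b', 'c', 'd']

# Treat a label missing on one side as 0 instead of producing NaN.
filled = left.add(right, fill_value=0)
print(filled.tolist())  # [1.0, 12.0, 23.0, 30.0]

# Or keep only the labels present in both Series.
overlap_only = total.dropna()
print(overlap_only.tolist())  # [12.0, 23.0]
```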

### Name attribute

Series can also have a name attribute:

In [58]:
s = pd.Series(np.random.randint(0, 100, 5), name="age")

In [59]:
s

0     6
1    84
2    31
3    18
4    32
Name: age, dtype: int64

In [60]:
s.name

'age'
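
The name can be changed with `rename`, which returns a new Series and leaves the original untouched, and it is used as the column label when the Series is turned into a DataFrame. A brief sketch with made-up values:

```python
import pandas as pd

s = pd.Series([6, 84, 31], name="age")

# rename with a scalar returns a new Series carrying the new name.
renamed = s.rename("age_years")
print(s.name, renamed.name)  # age age_years

# The name becomes the column label when converting to a DataFrame.
frame = s.to_frame()
print(frame.columns.tolist())  # ['age']
```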

### In Summary


This tutorial introduced the pandas `Series`. Here's a quick recap of what you've learned:

- A `Series` is a one-dimensional labeled array that can hold any data type.

- A `Series` can be built from an ndarray, a dict, or a scalar value, with an optional index.

- A `Series` behaves like both an ndarray and a dict, supports vectorized operations, and aligns on index labels automatically.

In the next section, we'll look at the DataFrame data structure. Unlike a Series, a DataFrame is two-dimensional: it can hold multiple Series as columns that share a common index, which opens up more powerful data analysis capabilities.