Pandas objects can be thought of as enhanced versions of NumPy structured arrays, **in which the rows and columns are identified with labels rather than simple integer indices.**

In [1]:
import numpy as np
import pandas as pd

## Pandas Series Object

`Series`: 

- one-dimensional array of indexed data

- can be created from a list or array 

- wraps both a sequence of values and a sequence of indices, which we can access with the `values` and `index` attributes

    -  `values` is a familiar NumPy array
    
    -  `index` is an array-like object of type `pd.Index`
    
- individual values can be accessed by their associated index via the familiar Python square-bracket notation

In [2]:
# Create series from a list
data = pd.Series([0.25, 0.5, 0.75, 1.0]) 
data

0    0.25
1    0.50
2    0.75
3    1.00
dtype: float64

In [3]:
# Access values
data.values 

array([0.25, 0.5 , 0.75, 1.  ])

In [4]:
# Access index
data.index

RangeIndex(start=0, stop=4, step=1)

In [5]:
# Access a slice of values (positions 1 and 2)
data[1:3]

1    0.50
2    0.75
dtype: float64

### `Series` as generalized NumPy array

While a NumPy array has an *implicitly defined* integer index used to access the values, a Pandas `Series` has an ***explicitly defined*** index associated with the values. That index does not have to consist of integers; for example, it can be made of strings:

In [7]:
data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])

In [8]:
data

a    0.25
b    0.50
c    0.75
d    1.00
dtype: float64

In [9]:
data['b']

0.5

We can even use **non-contiguous or non-sequential** indices:

In [10]:
data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=[2, 5, 3, 7])

In [11]:
data

2    0.25
5    0.50
3    0.75
7    1.00
dtype: float64
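With an integer index like this one, a label and a position can refer to different rows. A minimal sketch of the difference, using the `.loc` (label-based) and `.iloc` (position-based) accessors, which are not introduced above and are brought in here only to make the distinction explicit:

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0], index=[2, 5, 3, 7])

# Label lookup: the row whose index label is 3
by_label = data.loc[3]

# Position lookup: the row at integer position 3 (the fourth row)
by_position = data.iloc[3]

print(by_label)     # 0.75
print(by_position)  # 1.0
```

Spelling out `.loc` or `.iloc` avoids having to remember which interpretation plain square brackets apply to an integer key.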

### `Series` as specialized dictionary

We can think of a Pandas `Series` as a specialization of a Python dictionary: a dictionary is a structure that maps arbitrary keys to a set of arbitrary values, while a `Series` is a structure that maps typed keys to a set of typed values.

This type information makes a Pandas `Series` much more efficient than a Python dictionary for certain operations.
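As a rough illustration of what "certain operations" can mean, the sketch below compares an element-wise computation on a plain dictionary with the same computation on a `Series`. The toy data and variable names are hypothetical, and no timing is claimed; the point is only that the `Series` version operates on the whole typed array at once.

```python
import pandas as pd

counts = {'a': 10, 'b': 20, 'c': 30}  # hypothetical toy data

# Plain dictionary: every value is handled in an explicit Python loop
doubled_dict = {key: value * 2 for key, value in counts.items()}

# Series: arithmetic is applied to all values in one vectorized step
series = pd.Series(counts)
doubled_series = series * 2

# Boolean masking keeps only the entries that satisfy a condition
large = series[series > 15]

print(doubled_series.to_dict())  # {'a': 20, 'b': 40, 'c': 60}
print(list(large.index))         # ['b', 'c']
```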

In [12]:
# Construct a Series object directly from a Python dictionary

population_dict = {
    'California': 38332521,
    'Texas': 26448193,
    'New York': 19651127,
    'Florida': 19552860,
    'Illinois': 12882135
}

population = pd.Series(population_dict)
population

California    38332521
Texas         26448193
New York      19651127
Florida       19552860
Illinois      12882135
dtype: int64

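By default, the index of the resulting `Series` is drawn from the dictionary keys. With Python >= 3.6 and Pandas >= 0.23 the keys keep their insertion order, as in the output above; older versions sorted them instead.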

In [13]:
population['California']

38332521

Unlike a dictionary, the `Series` also supports array-style operations such as slicing:

In [16]:
population['California':'Illinois':2]

California    38332521
New York      19651127
Illinois      12882135
dtype: int64
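One detail of label slices is easy to miss: the end label is included, whereas a position-based slice excludes its stop position. A small sketch of the difference, rebuilding the `population` Series so the block stands alone and assuming the insertion-ordered index of Pandas >= 0.23; the `.loc` and `.iloc` accessors are not used above and are introduced here only for explicitness:

```python
import pandas as pd

population = pd.Series({
    'California': 38332521,
    'Texas': 26448193,
    'New York': 19651127,
    'Florida': 19552860,
    'Illinois': 12882135,
})

# Label-based slice: both endpoints are included
by_label = population.loc['California':'New York']

# Position-based slice: the stop position is excluded
by_position = population.iloc[0:2]

print(list(by_label.index))     # ['California', 'Texas', 'New York']
print(list(by_position.index))  # ['California', 'Texas']
```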

### Construct `Series` Object

Syntax: `pd.Series(data, index=index)`

`data` can be:

- a list or NumPy array

In [17]:
pd.Series([2, 4, 6]) # list as data

0    2
1    4
2    6
dtype: int64

In [19]:
pd.Series(np.array([2, 4, 6]))  # NumPy array as data

0    2
1    4
2    6
dtype: int64

- a scalar, which is repeated to fill the specified index

In [20]:
pd.Series(5, index=[100, 200, 300]) # scalar as data

100    5
200    5
300    5
dtype: int64

- a dictionary

> Note: \
> When the data is a dict, and an index is not passed, the Series index will be ordered by the dict’s insertion order, if you’re using Python version >= 3.6 and Pandas version >= 0.23.\
> If you’re using Python < 3.6 or Pandas < 0.23, and an index is not passed, the Series index will be the lexically ordered list of dict keys.\
> See: https://pandas.pydata.org/pandas-docs/stable/getting_started/dsintro.html

In [22]:
pd.Series({2:'two', 1:'one', 3:'three'})

2      two
1      one
3    three
dtype: object
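If a key-sorted index is wanted regardless of the Python or Pandas version, one option is to sort explicitly after construction. A minimal sketch using `sort_index()`, with the variable names chosen here for illustration:

```python
import pandas as pd

numbers = pd.Series({2: 'two', 1: 'one', 3: 'three'})

# sort_index() returns a new Series ordered by index label;
# the original `numbers` is left unchanged
ordered = numbers.sort_index()

print(list(ordered.index))   # [1, 2, 3]
print(list(ordered.values))  # ['one', 'two', 'three']
```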

In [25]:
pd.Series({2:'two', 1:'one', 3:'three'}, index=[2, 3])

2      two
3    three
dtype: object

When an index is passed along with a dictionary, the `Series` is populated only with the keys listed in that index.
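The reverse case is also worth knowing: a label in `index` that has no matching dictionary key is kept, and its value is filled with `NaN`. A short sketch, where the label `4` is a hypothetical key deliberately missing from the dictionary:

```python
import pandas as pd

# 3 is a key of the dictionary, 4 is not
numbers = pd.Series({2: 'two', 1: 'one', 3: 'three'}, index=[3, 4])

print(numbers.loc[3])           # three
print(numbers.isna().tolist())  # [False, True]
```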