### 1 `pandas` series and data frames 

This lesson introduces two core objects in the `pandas` library: the `pandas.Series` and the `pandas.DataFrame`.
We want to gain familiarity with these two objects, understand how they relate to each other, and review Python data structures such as dictionaries and lists.


### `pandas`

`pandas` is a Python package to wrangle and analyze tabular data. It is built on top of NumPy and has become the core tool for doing data analysis in Python. The standard abbreviation for `pandas` is `pd`.


In [7]:
# Always import packages in a single cell, with each package on its own line

import pandas as pd
import numpy as np

### Series 

The first core object of `pandas` is the series. A series is a one-dimensional array of indexed data.

The main difference between a `pandas.Series` and a NumPy array is that a `pandas.Series` has an explicit index.

In [2]:
# A numpy array 

array = np.random.randn(4) # random values from std normal distribution 
print(type(array))
print(array, "\n")

<class 'numpy.ndarray'>
[ 1.09796344  1.36370408  0.08745993 -0.72353854] 



In [4]:
# A pandas series made from the previous array 

s = pd.Series(array)
print(type(s))
print(s)

<class 'pandas.core.series.Series'>
0    1.097963
1    1.363704
2    0.087460
3   -0.723539
dtype: float64


The index is printed as part of the `pandas.Series`. A NumPy array is also indexable, but its index is not part of the data structure. Printing the `pandas.Series` also shows the values and their data type.
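To make the role of the index more concrete, here is a minimal sketch that inspects the index and the values of a series separately. The names `values`, `s_example`, and `labeled` are made up for illustration and are not part of the lesson's notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical example values (not from the notebook above)
values = np.array([1.5, 2.5, 3.5])

# With no index given, the series stores a default integer index
s_example = pd.Series(values)
print(s_example.index)       # the index object stored with the series
print(s_example.to_numpy())  # the underlying values as a NumPy array

# The same values with explicit labels as the index
labeled = pd.Series(values, index=['a', 'b', 'c'])
print(labeled['b'])          # access a value by its label
```

The index travels with the series, so we can look up values by label instead of only by position.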

### Creating a `pandas.Series`

The basic way to create a `pandas.Series` is to call:

s = pd.Series(data, index = index) 

`data` can be a list, a NumPy array, a Python dictionary, or a single number, boolean (True/False), or string.

The `index` parameter is optional. If it is omitted, `pandas` assigns a default integer index starting at 0.

In [10]:
# A series from a numpy array 
pd.Series(np.arange(3), index=[2023, 2024, 2025])

2023    0
2024    1
2025    2
dtype: int64

In [11]:
# A series from a list of strings with default index 

pd.Series(['EDS 220', 'EDS 222', 'EDS 223', 'EDS 242'])

0    EDS 220
1    EDS 222
2    EDS 223
3    EDS 242
dtype: object
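The examples above use an array and a list. As a small sketch of the single-value case, the code below passes one number, one boolean, and one string as `data`. The variable names and index labels are invented for illustration. When `data` is a single value and an index is given, the value is repeated for every index label.

```python
import pandas as pd

# A single number repeated for each index label
constant = pd.Series(3.0, index=['a', 'b', 'c'])
print(constant)

# A single boolean repeated for each index label
flags = pd.Series(True, index=[2023, 2024])
print(flags)

# A single string repeated for each index label
course = pd.Series('EDS 220', index=[0, 1])
print(course)
```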

Example: Creating a `pandas.Series` from a dictionary

A dictionary is a set of key-value pairs. If we create a `pandas.Series` from a dictionary, the keys become the index and the values become the corresponding data.

In [13]:
# Construct a dictionary 

d = {'key_0': 2,
     'key_1': 3,
     'key_2': 5}

# Initialize series using a dictionary

pd.Series(d)

key_0    2
key_1    3
key_2    5
dtype: int64
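Because the dictionary keys become the index, we can use them to look up values. The sketch below also shows what happens when we pass both a dictionary and an `index`: `pandas` selects the matching keys in the order given and fills any label that is not in the dictionary with `NaN`. The label `'key_9'` is a made-up key used only to show the missing case.

```python
import pandas as pd

d = {'key_0': 2,
     'key_1': 3,
     'key_2': 5}

# Keys become the index, so values can be accessed by key
s_dict = pd.Series(d)
print(s_dict['key_1'])

# Passing an index selects and reorders entries by key;
# 'key_9' is a hypothetical key not in d, so its value is NaN
subset = pd.Series(d, index=['key_2', 'key_0', 'key_9'])
print(subset)
```

Note that the `NaN` entry makes `pandas` store `subset` as floating-point values rather than integers.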