##### <b> Pandas Series </b></br> - A Series is equivalent to a single column of data

In [1]:
import numpy as np
import pandas as pd

##### <b> Series are Pandas data structures built on top of NumPy arrays</b></br> - A Series contains an index and an optional name alongside an array of data </br> - Can be created from other data types, usually imported </br> - Two or more Series grouped together form a Pandas DataFrame

In [2]:
# create a NumPy array of integers (0 through 4)
sales = np.arange(5)

# convert to Pandas Series
sales_series = pd.Series(sales, name="Sales")  # name gives the column a header when joining or appending to other DataFrames/Series

sales_series

0    0
1    1
2    2
3    3
4    4
Name: Sales, dtype: int32
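
A Series can also be built from other Python containers, and several Series combine into a DataFrame. The following is a minimal sketch, not part of the original notebook: the `prices` and `units` data are hypothetical, and `pd.concat` is just one of several ways to group Series into columns.

```python
import pandas as pd

# Hypothetical data, used only for illustration
prices = pd.Series([1.5, 2.0, 2.5], name="Price")        # from a list
units = pd.Series({0: 10, 1: 20, 2: 30}, name="Units")   # from a dict; keys become the index

# Series that share an index line up as columns; each name becomes a column header
df = pd.concat([prices, units], axis=1)
print(df)
```

Because both Series use the labels 0, 1, 2, their rows align one to one in the resulting DataFrame.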

##### <b> Pandas Series have these key properties</b></br> - values: data array in the series </br> - index: index array in the series </br> - name: optional name for the series (useful for accessing columns)</br> - dtype: data type of the elements in the values array

In [3]:
# accessing series values is accessing a numpy array
sales_series.values

array([0, 1, 2, 3, 4])

In [4]:
# can view index
sales_series.index

RangeIndex(start=0, stop=5, step=1)

In [5]:
# can change the index values using range or manually
sales_series.index = pd.RangeIndex(10, 51, 10)
sales_series

10    0
20    1
30    2
40    3
50    4
Name: Sales, dtype: int32
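
Once the index changes, labels and positions no longer coincide. The sketch below rebuilds the same Series so it runs on its own and contrasts label-based lookup (`.loc`) with position-based lookup (`.iloc`); the letter labels in the last step are an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

sales_series = pd.Series(np.arange(5), name="Sales")
sales_series.index = pd.RangeIndex(10, 51, 10)

by_label = sales_series.loc[30]      # row whose index label is 30
by_position = sales_series.iloc[2]   # third row, regardless of its label
print(by_label, by_position)

# The index can also be assigned manually; the list must match the Series length
sales_series.index = ["a", "b", "c", "d", "e"]
print(sales_series.loc["c"])
```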

In [6]:
# aggregations can be performed
sales_series.mean()

2.0

In [7]:
# name can be updated
sales_series.name = "updatedSales"
sales_series

10    0
20    1
30    2
40    3
50    4
Name: updatedSales, dtype: int32

| Data Type         | Library | Description                    | Bit Size         |
|-------------------|---------|--------------------------------|------------------|
| bool              | NumPy   | Boolean True/False             | 8                |
| int64             | NumPy   | Whole numbers                  | 8, 16, 32, 64    |
| float64           | NumPy   | Decimal numbers                | 16, 32, 64       |
| object            | NumPy   | Any Python object              | N/A              |
| boolean           | Pandas  | Nullable Boolean True/False    | 8                |
| Int64             | Pandas  | Nullable whole numbers         | 8, 16, 32, 64    |
| Float64           | Pandas  | Nullable decimal numbers       | 32, 64           |
| string            | Pandas  | Text/string data               | N/A              |
| category          | Pandas  | Maps categorical data to a numeric array for efficiency | N/A |
| datetime64        | Pandas  | A single moment in time (January 4, 2015, 2:00:00 PM)   | 64  |
| timedelta64       | Pandas  | Duration between two dates or times                     | 64  |
| period            | Pandas  | A span of time                 | N/A              |
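
The practical difference between the NumPy and the nullable Pandas types shows up when data is missing. This is a small sketch with made-up values, not taken from the notebook above:

```python
import pandas as pd

# NumPy-backed integers cannot hold a missing value, so pandas falls back to float64 and NaN
numpy_backed = pd.Series([1, None, 3])

# The nullable Pandas type keeps whole numbers and stores the gap as <NA>
nullable = pd.Series([1, None, 3], dtype="Int64")

# A smaller bit size can be requested explicitly
small = pd.Series([1, 2, 3], dtype="int8")

print(numpy_backed.dtype, nullable.dtype, small.dtype)
```

Note the capital letter: `"Int64"` requests the nullable Pandas type, while `"int64"` requests the NumPy one.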


##### </br> <b> Object/Text Data Types</b></br>  object - Any Python Object </br> string - only contains strings or text </br> category - Maps categorical data to a numeric array for efficiency 
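
To make the three text-like types concrete, the sketch below converts one hypothetical Series of color names to each of them. It assumes nothing beyond pandas itself; the `colors` data is invented for illustration.

```python
import pandas as pd

# Hypothetical data; object dtype is requested explicitly
colors = pd.Series(["red", "blue", "red", "red", "blue"], dtype="object")

as_string = colors.astype("string")      # dedicated text dtype
as_category = colors.astype("category")  # unique values stored once, rows stored as integer codes

print(as_category.cat.categories)        # the distinct labels
print(as_category.cat.codes)             # the integer code behind each row
```

The category version stores each distinct label once and keeps only small integer codes per row, which is where the efficiency gain comes from when a column has few unique values.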

##### </br> <b> Time Series </b></br> datetime - a single moment in time (January 4, 2015, 2:00:00 PM) </br> timedelta - the duration between two dates or times (10 days, 3 seconds, etc.) </br> period - a span of time (a day, a week, etc.)
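
As a rough illustration of the three time-related types, the sketch below builds one Series of each kind. The dates are chosen to mirror the examples in the descriptions above and are otherwise arbitrary.

```python
import pandas as pd

# datetime: single moments in time
moments = pd.Series(pd.to_datetime(["2015-01-04 14:00:00", "2015-01-14 14:00:03"]))

# timedelta: subtracting two moments yields a duration
duration = moments.iloc[1] - moments.iloc[0]
gaps = moments.diff()   # Series of durations between consecutive rows (first row is NaT)

# period: spans of time, here three consecutive days
periods = pd.Series(pd.period_range("2015-01-04", periods=3, freq="D"))

print(duration)
print(periods.dtype)
```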

##### <b> Type Conversion </b>

In [8]:
# using the method .astype("<Data Type>") you can convert values if they are compatible
print(sales_series)
print(sales_series.astype("bool"))
print(sales_series.astype("float"))
print(sales_series.astype("object"))
print(sales_series.astype("string"))

10    0
20    1
30    2
40    3
50    4
Name: updatedSales, dtype: int32
10    False
20     True
30     True
40     True
50     True
Name: updatedSales, dtype: bool
10    0.0
20    1.0
30    2.0
40    3.0
50    4.0
Name: updatedSales, dtype: float64
10    0
20    1
30    2
40    3
50    4
Name: updatedSales, dtype: object
10    0
20    1
30    2
40    3
50    4
Name: updatedSales, dtype: string
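
When the values are not compatible with the target type, `.astype` raises an error instead of converting. The sketch below uses a hypothetical Series of text values; `pd.to_numeric` with `errors="coerce"` is shown as one possible fallback, not as the only way to handle bad values.

```python
import pandas as pd

# Hypothetical data: the last value cannot be read as a number
raw = pd.Series(["1", "2", "three"], dtype="object")

failure = None
try:
    raw.astype("int64")
except ValueError as err:
    failure = str(err)   # conversion is refused because "three" is not an integer

# One possible fallback: unparseable values become NaN instead of raising
numeric = pd.to_numeric(raw, errors="coerce")

print(failure)
print(numeric)
```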


In [9]:
# True counts as 1 and False as 0, so the mean is the share of non-zero values
print(sales_series.astype("bool").mean())

0.8


In [10]:
# a unit-less "datetime64" dtype is rejected with a ValueError; a unit must be given
# print(sales_series.astype("datetime64"))
# ValueError: The 'datetime64' dtype has no unit. Please pass in 'datetime64[ns]' instead.
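
If the integers are meant to represent dates, one option is `pd.to_datetime` with an explicit `unit`, which states how each number should be read. The sketch below assumes, purely for illustration, that the values are days since 1970-01-01; the notebook does not say what the numbers mean.

```python
import numpy as np
import pandas as pd

sales_series = pd.Series(np.arange(5), name="updatedSales")

# Assumption for illustration only: each integer is a number of days since 1970-01-01
as_dates = pd.to_datetime(sales_series, unit="D")
print(as_dates)
```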
