##### <b> Pandas Series </b></br> - Series is equivalent to a column of data

In [9]:
import numpy as np
import pandas as pd

##### <b> The Index </b></br> Index lets you easily access 'rows' in Pandas Series or Dataframe

In [10]:
# create a NumPy array of the integers 0 through 4
sales = np.arange(5)

# convert to a Pandas Series
# name gives the column a header when joining or appending to other dataframes/series
sales_series = pd.Series(sales, name="Sales")

sales_series

0    0
1    1
2    2
3    3
4    4
Name: Sales, dtype: int32
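
The pieces shown in that output (index on the left, values on the right, name at the bottom) can each be read back from the Series. A minimal sketch, rebuilding the same `sales_series`; the variable names `index_values`, `data_values` and `series_name` are only for illustration:

```python
import numpy as np
import pandas as pd

sales_series = pd.Series(np.arange(5), name="Sales")

# the default index is a RangeIndex counting up from 0
index_values = list(sales_series.index)

# the underlying data as a NumPy array
data_values = sales_series.to_numpy()

# the header that was passed as name=
series_name = sales_series.name

print(sales_series.index)
print(data_values)
print(series_name)
```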

In [11]:
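# a Series can be accessed and sliced with [] like base Python sequences such as lists
sales_series[2]  # not displayed: only a print() or the last expression in a cell shows output
print(sales_series[2:4])
# however, the .iloc[] and .loc[] accessors covered below are the preferred way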

2    2
3    3
Name: Sales, dtype: int32
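
One reason plain `[]` can be confusing: when the index holds integers that are not in the default order, a single integer inside `[]` is treated as a label, not a position. A small sketch with a made-up Series (`shuffled` and its values are hypothetical, not part of the sales data above):

```python
import pandas as pd

# integer labels deliberately out of order
shuffled = pd.Series([10, 20, 30], index=[2, 0, 1], name="Sales")

by_bracket = shuffled[2]        # looks up the LABEL 2
by_label = shuffled.loc[2]      # explicit label lookup, same result
by_position = shuffled.iloc[2]  # explicit position lookup, third row

print(by_bracket, by_label, by_position)
```

Using `.loc[]` or `.iloc[]` states the intent directly, so the reader does not have to know what the index looks like.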


##### <b> Custom Indices </b></br> There are cases to use a custom index for accessing rows

In [12]:
# a list can be assigned as the index of a series as long as its length matches the number of values
# for data analysis, the default integer index is usually best
sales = [0, 5, 155, 0, 518]
items = ['coffee', 'bananas', 'tea', 'coconut', 'sugar']
# use items as the index labels of indexed_sales
indexed_sales = pd.Series(sales, index=items, name='Indexed_Sales')
indexed_sales = pd.Series(sales, index=items, name='Indexed_Sales')
indexed_sales

coffee       0
bananas      5
tea        155
coconut      0
sugar      518
Name: Indexed_Sales, dtype: int64

In [13]:
# a row value can be looked up by its index label
indexed_sales['tea']

155
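
A label that is not in the index raises a `KeyError`. The sketch below shows two ways to guard against that, assuming the same `indexed_sales` data; the label `'milk'` is a made-up example of a missing item:

```python
import pandas as pd

sales = [0, 5, 155, 0, 518]
items = ['coffee', 'bananas', 'tea', 'coconut', 'sugar']
indexed_sales = pd.Series(sales, index=items, name='Indexed_Sales')

# "in" checks the index labels, not the values
has_tea = 'tea' in indexed_sales
has_155 = 155 in indexed_sales

# .get() returns a fallback instead of raising for a missing label
milk_sales = indexed_sales.get('milk', 0)

# plain [] raises KeyError for a missing label
missing_raises = False
try:
    indexed_sales['milk']
except KeyError:
    missing_raises = True

print(has_tea, has_155, milk_sales, missing_raises)
```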

In [14]:
# when slicing with index labels the stop point is included; when slicing with integer positions the stop point is not included
print()
print('using Index name does include stop point which is coconut')
print(indexed_sales['bananas':'coconut'])
print()
print('using Index Integer does not include stop point which is coconut')
print(indexed_sales[1:3])


using Index name does include stop point which is coconut
bananas      5
tea        155
coconut      0
Name: Indexed_Sales, dtype: int64

using Index Integer does not include stop point which is coconut
bananas      5
tea        155
Name: Indexed_Sales, dtype: int64


##### <b> .iloc[] Method </b></br> Preferred way to access values by integer position </br> - Works even when the Series has a custom, non-integer index </br> - more efficient </br> &nbsp;&nbsp; Series: seriesname.iloc[row position] </br> &nbsp;&nbsp; Dataframe: dataframename.iloc[row position, column position]

In [15]:
# can call row value using the iloc[] accessor 
print()
print('Single Index iloc[] Call')
print(indexed_sales.iloc[2])
print()
print('Index slice iloc[] Call')
print(indexed_sales.iloc[2:4])
print()
print('specific Indices iloc[] Call which requires nested list')
print(indexed_sales.iloc[[0, 2, 4]]) # the positions are passed as a list inside the brackets
print()
print('Last Index iloc[] Call')
print(indexed_sales.iloc[-1]) 
print()
print('Reverse series Index iloc[] Call')
print(indexed_sales.iloc[::-1])


Single Index iloc[] Call
155

Index slice iloc[] Call
tea        155
coconut      0
Name: Indexed_Sales, dtype: int64

specific Indices iloc[] Call which requires nested list
coffee      0
tea       155
sugar     518
Name: Indexed_Sales, dtype: int64

Last Index iloc[] Call
518

Reverse series Index iloc[] Call
sugar      518
coconut      0
tea        155
bananas      5
coffee       0
Name: Indexed_Sales, dtype: int64
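
The examples above only use a Series. As a rough sketch of the Dataframe form `dataframename.iloc[row position, column position]`, the block below builds a small dataframe from the same sales figures. The `inventory` name and the `price` column are invented for illustration:

```python
import pandas as pd

items = ['coffee', 'bananas', 'tea', 'coconut', 'sugar']
inventory = pd.DataFrame(
    {
        'sales': [0, 5, 155, 0, 518],
        'price': [3.5, 0.25, 4.0, 2.0, 1.5],  # hypothetical values
    },
    index=items,
)

# single cell: third row, first column
cell = inventory.iloc[2, 0]

# block: rows at positions 1 and 2, columns at positions 0 and 1 (stops excluded)
block = inventory.iloc[1:3, 0:2]

print(cell)
print(block)
```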


##### <b> .loc[] Method </b></br> Preferred way to access values by their index labels </br> &nbsp;&nbsp; seriesname.loc[row label] </br> &nbsp;&nbsp; dataframename.loc[row label, column label] </br> - If the row index is the default integer range, .loc[] takes the integer row label together with the column label

In [16]:
# can call row value using the loc[] accessor 
print()
print('Single Index loc[] Call')
print(indexed_sales.loc['coconut'])
print()
print('Index slice loc[] Call')
print(indexed_sales.loc['coffee':'coconut'])


Single Index loc[] Call
0

Index slice loc[] Call
coffee       0
bananas      5
tea        155
coconut      0
Name: Indexed_Sales, dtype: int64
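
A sketch of the Dataframe form `dataframename.loc[row label, column label]`, including the default integer index case. The `inventory` dataframe and its `price` column are invented for illustration:

```python
import pandas as pd

items = ['coffee', 'bananas', 'tea', 'coconut', 'sugar']
inventory = pd.DataFrame(
    {
        'sales': [0, 5, 155, 0, 518],
        'price': [3.5, 0.25, 4.0, 2.0, 1.5],  # hypothetical values
    },
    index=items,
)

# single cell by row label and column label
cell = inventory.loc['tea', 'sales']

# label slice: the stop label 'coconut' is included
rows = inventory.loc['bananas':'coconut', 'sales']

# with the default integer index, .loc[] still treats the numbers as labels,
# so the stop value 3 is included as well
default_index = inventory.reset_index(drop=True)
inclusive = default_index.loc[1:3, 'sales']

print(cell)
print(rows)
print(inclusive)
```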


##### <b>  Duplicate Index Values </b></br> Possible to have duplicate Index values in Pandas Series/Dataframe </br>- DO NOT SET DUPLICATE INDEX VALUES - </br> - accessing a duplicated label with .loc[] returns all corresponding rows

In [17]:
# assigning list of values which includes a duplicate value. 
#####################################
# DO NOT SET DUPLICATE INDEX VALUES
#####################################
sales1 = [0, 5, 155, 0, 518]
items1 = ['coffee', 'coffee', 'tea', 'coconut', 'sugar']
# assigning items1 as the index labels of duplicate_index
duplicate_index = pd.Series(sales1, index=items1, name='Duplicate_Index')
duplicate_index

coffee       0
coffee       5
tea        155
coconut      0
sugar      518
Name: Duplicate_Index, dtype: int64
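
To make the problem concrete, the sketch below looks up the duplicated label and shows two ways to detect duplicates before they cause surprises. It rebuilds the same `duplicate_index` Series; the other variable names are only for illustration:

```python
import pandas as pd

sales1 = [0, 5, 155, 0, 518]
items1 = ['coffee', 'coffee', 'tea', 'coconut', 'sugar']
duplicate_index = pd.Series(sales1, index=items1, name='Duplicate_Index')

# a duplicated label returns a Series with every matching row
coffee_rows = duplicate_index.loc['coffee']

# a unique label still returns a single value
tea_value = duplicate_index.loc['tea']

# checks for duplicated labels
is_unique = duplicate_index.index.is_unique
dup_mask = duplicate_index.index.duplicated()  # True for repeats after the first

print(coffee_rows)
print(tea_value)
print(is_unique)
print(dup_mask)
```

The return type changing from a single value to a Series depending on the label is one reason to avoid duplicate index values.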

##### <b>  Resetting Index Values </b></br> The index can be reset to the default range of integers using the .reset_index() method </br> - by default, the existing index becomes a new column in the resulting dataframe

In [18]:
# calling .reset_index() with its defaults on a Series returns a DataFrame
duplicate_index.reset_index()

     index  Duplicate_Index
0   coffee                0
1   coffee                5
2      tea              155
3  coconut                0
4    sugar              518
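
The old index lands in a column called `index` because the index itself has no name. A possible way to get friendlier column headers, sketched under the assumption that `item` and `units` are acceptable names (both are invented here):

```python
import pandas as pd

sales1 = [0, 5, 155, 0, 518]
items1 = ['coffee', 'coffee', 'tea', 'coconut', 'sugar']
duplicate_index = pd.Series(sales1, index=items1, name='Duplicate_Index')

# name the index first, so the new column is called 'item' instead of 'index'
named_axis = duplicate_index.rename_axis('item').reset_index()

# name= sets the header of the values column in the resulting dataframe
named_values = duplicate_index.reset_index(name='units')

print(named_axis.columns.tolist())
print(named_values.columns.tolist())
```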


In [19]:
# with drop=True, the index is reset and the previous index is discarded instead of kept as a column
duplicate_index.reset_index(drop=True)

0      0
1      5
2    155
3      0
4    518
Name: Duplicate_Index, dtype: int64

In [20]:
# chain .reset_index() with .loc[] to slice by the new integer labels (the stop label 4 is included)
duplicate_index.reset_index(drop=True).loc[2:4]


2    155
3      0
4    518
Name: Duplicate_Index, dtype: int64

In [21]:
# inplace=True modifies duplicate_index itself instead of returning a new Series
duplicate_index.reset_index(drop=True, inplace=True)
duplicate_index

0      0
1      5
2    155
3      0
4    518
Name: Duplicate_Index, dtype: int64
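
For comparison, here is a sketch of the two ways to keep the reset result: modifying the Series in place, or assigning the returned Series to a name. The variable names `in_place`, `returned` and `reassigned` are only for illustration:

```python
import pandas as pd

sales1 = [0, 5, 155, 0, 518]
items1 = ['coffee', 'coffee', 'tea', 'coconut', 'sugar']

# option 1: inplace=True changes the Series and returns None
in_place = pd.Series(sales1, index=items1, name='Duplicate_Index')
returned = in_place.reset_index(drop=True, inplace=True)

# option 2: keep the original and store the new Series under another name
original = pd.Series(sales1, index=items1, name='Duplicate_Index')
reassigned = original.reset_index(drop=True)

print(returned)
print(in_place.index.tolist())
print(original.index.tolist())
```

With a Series, `inplace=True` only works together with `drop=True`, since keeping the old index would turn the result into a dataframe.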