# Pandas

Pandas is a powerful open-source data analysis and manipulation library for Python. It provides easy-to-use data structures and functions for efficiently working with structured data such as tabular, time series, and matrix data.

### Import Pandas

In [1]:
import pandas as pd  # import the Pandas library under the alias pd
import numpy as np  # import the NumPy library under the alias np

Pandas has two main data structures: Series and DataFrame.

# Series Object
A Series is a one-dimensional data structure. Because it holds a single column, it has no column names, but every value has an index label.

In [2]:
name = ["Arka", "Tony", "Stark", "Mosky"]
print(name)

['Arka', 'Tony', 'Stark', 'Mosky']


In [3]:
dataname = pd.Series(name)
dataname

0     Arka
1     Tony
2    Stark
3    Mosky
dtype: object

Convert a Series to an array

In [4]:
dataname.values

array(['Arka', 'Tony', 'Stark', 'Mosky'], dtype=object)
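
As a small aside, `to_numpy()` is an alternative to `.values` for the same conversion. The sketch below rebuilds the Series so that it runs on its own; the variable name `as_array` is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the Series from the example above
dataname = pd.Series(["Arka", "Tony", "Stark", "Mosky"])

# to_numpy() returns the values as a NumPy array, like .values
as_array = dataname.to_numpy()

print(type(as_array))
print(as_array.tolist())
```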

Display index

In this example, `dataname.index` returns a RangeIndex object: the index labels start at 0 (inclusive), end at 4 (exclusive), and increase in steps of 1.

In [5]:
dataname.index

RangeIndex(start=0, stop=4, step=1)
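
To make the half-open range concrete, the following sketch lists the labels that the RangeIndex contains. It assumes the same four-element Series as above; the name `labels` is illustrative.

```python
import pandas as pd

dataname = pd.Series(["Arka", "Tony", "Stark", "Mosky"])

# A RangeIndex behaves like Python's range: the stop value is excluded
labels = list(dataname.index)

print(labels)       # [0, 1, 2, 3]
print(len(labels))  # 4
```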

Access data

In [6]:
dataname[2]

'Stark'

### Implicit and Explicit index

Implicit indexing is Pandas' default, where index labels are automatically generated as integers starting from 0. Explicit indexing lets you define custom index labels, one for each value.

In [7]:
dataname = pd.Series(["Arka", "Tony", "Stark", "Mosky"], index=['a','b','c','d'])
dataname

a     Arka
b     Tony
c    Stark
d    Mosky
dtype: object

In [8]:
dataname.values

array(['Arka', 'Tony', 'Stark', 'Mosky'], dtype=object)

In [9]:
dataname.index

Index(['a', 'b', 'c', 'd'], dtype='object')

Select data

In [10]:
#explicit index

dataname['a']

'Arka'

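Even though we have created an explicit index, we can still access values by their implicit (positional) index. Recent pandas versions deprecate this positional fallback for plain `[]` access and emit a FutureWarning, so `dataname.iloc[3]` is the safer way to access a value by position.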

In [11]:
#implicit index

dataname[3]

'Mosky'

When the explicit index labels are themselves integers, plain `[]` access looks the key up as a label rather than as a position. In other words, the explicit index takes precedence over the implicit one.

In [12]:
sameindex = pd.Series([10,100,1000,10000], index=[1,3,5,7])
sameindex

1       10
3      100
5     1000
7    10000
dtype: int64

In [13]:
sameindex[3]

100

In [15]:
# tries to find the element with index label 0, which does not exist in the Series

sameindex[0]

KeyError: 0

Note: When the explicit index consists of integers, use `iloc` to access values by position. Plain `[]` looks up labels and raises a KeyError if the label does not exist.
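
A minimal sketch of the note above, using the same integer-labelled Series: `iloc` reads by position, `loc` reads by label, and a missing label raises `KeyError`. The names `first_by_position`, `by_label`, and `missing_label_found` are illustrative only.

```python
import pandas as pd

sameindex = pd.Series([10, 100, 1000, 10000], index=[1, 3, 5, 7])

# Position 0 is the first element, whatever its label is
first_by_position = sameindex.iloc[0]

# Label 3 belongs to the second element
by_label = sameindex.loc[3]

# Label 0 does not exist, so a label lookup fails
try:
    sameindex.loc[0]
    missing_label_found = True
except KeyError:
    missing_label_found = False

print(first_by_position, by_label, missing_label_found)
```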

### Data Slicing

In [16]:
sameindex = pd.Series([10,100,1000,10000], index=['b','c','e','a'])
sameindex

b       10
c      100
e     1000
a    10000
dtype: int64

In [17]:
# for example, select from label 'c' through label 'a' in the custom index
sameindex['c':'a']  # explicit (label) slicing: the end label is included

c      100
e     1000
a    10000
dtype: int64

In [18]:
# When we slice by implicit index, the stop position is excluded, as in a Python range: 1:3 returns positions 1 and 2 only.
sameindex[1:3]  # implicit (positional) slicing

c     100
e    1000
dtype: int64
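
Label slices include the end label, while positional slices exclude the stop position. The sketch below contrasts the two on the same Series, using `loc` and `iloc` to make the intent explicit; the variable names are illustrative.

```python
import pandas as pd

sameindex = pd.Series([10, 100, 1000, 10000], index=["b", "c", "e", "a"])

# Label slice: both 'c' and 'e' are included
by_label = sameindex.loc["c":"e"]

# Positional slice: 'e' sits at position 2, so the stop must be 3 to include it
by_position = sameindex.iloc[1:3]

# Stopping at 2 leaves 'e' out
shorter = sameindex.iloc[1:2]

print(by_label.tolist())
print(by_position.tolist())
print(shorter.tolist())
```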

### LOC and ILOC

When the explicit index labels are integers, they can be confused with the implicit positions, and plain `[]` access becomes ambiguous. To avoid this, we use the `loc` and `iloc` indexers:

- `loc` (location): accesses data by explicit index labels.
- `iloc` (integer location): accesses data by implicit (integer) positions, regardless of any explicit labels.

Following these rules lets us retrieve data consistently, regardless of the index type.
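
The difference is easiest to see when the labels are integers. The following sketch assumes a Series labelled 1, 3, 5, 7; the name `numbers` and the other variable names are hypothetical.

```python
import pandas as pd

numbers = pd.Series([10, 100, 1000, 10000], index=[1, 3, 5, 7])

# The same integer means different things to loc and iloc
value_at_label_1 = numbers.loc[1]      # label 1 is the first element
value_at_position_1 = numbers.iloc[1]  # position 1 is the second element

# loc slices by label and includes the end label 5
label_slice = numbers.loc[1:5]

# iloc slices by position and excludes the stop position 3
position_slice = numbers.iloc[1:3]

print(value_at_label_1, value_at_position_1)
print(label_slice.tolist())
print(position_slice.tolist())
```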

In [19]:
sameindex = pd.Series([10,100,1000,10000], index=['b','c','e','a'])
sameindex

b       10
c      100
e     1000
a    10000
dtype: int64

When we access a value with plain `[]`, the key is matched against the explicit index.

In [20]:
sameindex['b']

10

In [21]:
# slicing with two parameters (start and stop labels)
sameindex['b' : 'e']

b      10
c     100
e    1000
dtype: int64

#### LOC

In [22]:
#loc

sameindex.loc['b']  # select by explicit index

10

In [23]:
sameindex.loc['b':'e']  # slice by explicit index

b      10
c     100
e    1000
dtype: int64

#### ILOC

In [24]:
#iloc

sameindex.iloc[0]  # select by implicit index

10

In [25]:
sameindex.iloc[0:2]  # slice by implicit index

b     10
c    100
dtype: int64

### Dictionary to Series

In [26]:
dict_price = {"Book" : 2500,
             "Eraser" : 500,
             "Ruler" : 1200,
             "Pen" : 1500,
             "Pencil" : 1000}

In [27]:
dict_price

{'Book': 2500, 'Eraser': 500, 'Ruler': 1200, 'Pen': 1500, 'Pencil': 1000}

In [28]:
dict_stock = {"Book" : 1000,
             "Eraser" : 200,
             "Ruler" : 100,
             "Pen" : 300,
             "Pencil" : 500}

In [29]:
dict_stock

{'Book': 1000, 'Eraser': 200, 'Ruler': 100, 'Pen': 300, 'Pencil': 500}

#### pd.Series()

In [30]:
#dictionary to series transformation

price = pd.Series(dict_price)
price

Book      2500
Eraser     500
Ruler     1200
Pen       1500
Pencil    1000
dtype: int64

In [31]:
#dictionary to series transformation

stock = pd.Series(dict_stock)
stock

Book      1000
Eraser     200
Ruler      100
Pen        300
Pencil     500
dtype: int64
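
As the outputs above show, the dictionary keys become the explicit index. The sketch below additionally passes the optional `index` argument of `pd.Series` to select and order keys; the label "Marker" is a hypothetical item that is not in the dictionary.

```python
import pandas as pd

dict_price = {"Book": 2500, "Eraser": 500, "Ruler": 1200, "Pen": 1500, "Pencil": 1000}

# Keys become the explicit index, in insertion order
price = pd.Series(dict_price)

# index= picks only the listed keys, in the listed order
selected = pd.Series(dict_price, index=["Pen", "Book"])

# A label missing from the dictionary gets NaN as its value
with_missing = pd.Series(dict_price, index=["Pen", "Marker"])

print(selected)
print(with_missing)
```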