# Operations

Pandas has many useful operations that don't fall into any distinct category. This lecture walks through them:

In [None]:
import pandas as pd
import numpy as np
import string

# Generating the random data
np.random.seed(1)  # For reproducibility
data = {
    "Integers": np.random.randint(1, 10, 10),
    "RandomStrings": [''.join(np.random.choice(list(string.ascii_letters), 3)) for _ in range(10)],
    "Hours": [f"{np.random.randint(0, 24)}:{str(np.random.randint(0, 60)).zfill(2)}" for _ in range(10)],
}

# Creating the DataFrame
df = pd.DataFrame(data)

df


### Info on Unique Values

In [None]:
df['Integers'].unique()

In [None]:
df['RandomStrings'].nunique()

In [None]:
df['Hours'].value_counts()
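
To make the difference between these three methods easier to see, here is a minimal sketch on a small hand-written Series. The `colors` Series and the variable names are illustrative assumptions, not part of the lecture data.

```python
import pandas as pd

# Small, hand-written Series so the results are easy to check by eye
colors = pd.Series(["red", "blue", "red", "green", "blue", "red"])

unique_values = colors.unique()   # distinct values, in order of first appearance
n_unique = colors.nunique()       # number of distinct values
counts = colors.value_counts()    # occurrences per value, most frequent first

print(unique_values)
print(n_unique)
print(counts)
```

`unique()` returns an array of the values themselves, `nunique()` returns a single number, and `value_counts()` returns a Series indexed by value.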

### Selecting Data

In [None]:
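# Select rows using criteria from multiple columns; wrap each condition in parentheses.
# "DoY" must match a value in RandomStrings exactly, otherwise the result is empty.
newdf = df[(df['Integers'] >= 2) & (df['RandomStrings'] == "DoY")]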

In [None]:
newdf
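
The same pattern works with `|` for "or". The sketch below uses a small hypothetical `people` frame (not the lecture data) so the selected rows are predictable. The parentheses matter because `&` and `|` bind more tightly than comparison operators in Python.

```python
import pandas as pd

# Hypothetical data, chosen so the matching rows are easy to verify
people = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "city": ["Oslo", "Lima", "Oslo", "Cairo"],
})

# Rows where BOTH conditions hold
both = people[(people["age"] >= 30) & (people["city"] == "Oslo")]

# Rows where AT LEAST ONE condition holds
either = people[(people["age"] >= 50) | (people["city"] == "Lima")]

print(both)
print(either)
```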

### Applying Functions

In [None]:
def times2(x):
    return x*2

In [None]:
df['Integers'].apply(times2)

In [None]:
df['RandomStrings'].apply(len)

In [None]:
df['Integers'].sum()
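
For a one-off function, `apply` also accepts a lambda, so a named helper like `times2` is optional. A minimal sketch on an illustrative `numbers` Series (an assumption, not the lecture data), compared with the equivalent vectorized expression:

```python
import pandas as pd

numbers = pd.Series([1, 2, 3, 4])

# Element-wise function passed inline as a lambda
doubled_apply = numbers.apply(lambda x: x * 2)

# Same result using pandas' built-in arithmetic on the whole Series
doubled_vectorized = numbers * 2

print(doubled_apply.tolist())
```

For simple arithmetic the vectorized form is usually the better choice; `apply` is most useful when the logic has no built-in equivalent.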

**Permanently Removing a Column**

In [None]:
del df['Hours']

In [None]:
df
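
`del` modifies the DataFrame in place. If the original should stay intact, `drop` returns a new DataFrame instead. A short sketch with a hypothetical `frame` (not the lecture data):

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# drop returns a new DataFrame; frame itself keeps all three columns
without_b = frame.drop(columns="b")

print(list(without_b.columns))
print(list(frame.columns))
```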

**Get column and index names:**

In [None]:
df.columns

In [None]:
df.index

**Sorting and Ordering a DataFrame:**

In [None]:
df

In [None]:
df.sort_values(by='RandomStrings')  # inplace=False by default, so df itself is unchanged
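
`sort_values` can also sort by several columns, each with its own direction. The sketch below uses a hypothetical `scores` frame (not the lecture data) and sorts by team ascending, then by points descending within each team.

```python
import pandas as pd

scores = pd.DataFrame({
    "team": ["b", "a", "b", "a"],
    "points": [10, 7, 3, 9],
})

# One ascending flag per sort column
by_team_then_points = scores.sort_values(
    by=["team", "points"], ascending=[True, False]
)

print(by_team_then_points)
```

The original index labels travel with their rows, which is why the index of the result is no longer in order.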

**Find Null Values or Check for Null Values**

In [None]:
df.isnull()

In [None]:
# Drop rows with NaN Values
df.dropna()
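
These calls are easier to follow on data that actually contains missing values. A minimal sketch with a hypothetical `gaps` frame (not the lecture data), showing a per-column count of missing values and a few `dropna` variants:

```python
import numpy as np
import pandas as pd

gaps = pd.DataFrame({
    "x": [1.0, np.nan, 3.0],
    "y": [np.nan, np.nan, 6.0],
    "z": ["a", "b", "c"],
})

# Count missing values in each column
missing_per_column = gaps.isnull().sum()

# Keep only rows with no missing values
complete_rows = gaps.dropna()

# Keep only columns with no missing values
complete_columns = gaps.dropna(axis=1)

# Keep rows that have at least 2 non-missing values
at_least_two = gaps.dropna(thresh=2)

print(missing_per_column)
```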

**Filling in NaN values with something else:**

In [None]:
import numpy as np

In [None]:
df = pd.DataFrame({'col1':[1,2,3,np.nan],
                   'col2':[np.nan,555,666,444],
                   'col3':['abc','def','ghi','xyz']})
df.head()

In [None]:
df.fillna('FILL')
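
Filling every gap with one string mixes text into numeric columns. A common alternative is to fill each column with a value that suits it, such as the column mean. The sketch below assumes a `readings` frame with the same numeric values as the example above; the name and the choice of fill values are illustrative.

```python
import numpy as np
import pandas as pd

readings = pd.DataFrame({
    "col1": [1, 2, 3, np.nan],
    "col2": [np.nan, 555, 666, 444],
})

# Fill one column with its own mean
filled_col1 = readings["col1"].fillna(readings["col1"].mean())

# Fill several columns at once with a dict of column -> fill value
filled_by_column = readings.fillna({"col1": 0, "col2": readings["col2"].mean()})

print(filled_col1.tolist())
print(filled_by_column)
```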

**Pivot Tables:**

In [None]:
data = {'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
        'B': ['one', 'one', 'two', 'two', 'one', 'one'],
        'C': ['x', 'y', 'x', 'y', 'x', 'y'],
        'D': [1, 3, 2, 5, 4, 1]}

df = pd.DataFrame(data)

In [None]:
df

In [None]:
df.pivot_table(values='D', index=['A', 'B'], columns=['C'])  # aggregates with the mean by default
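
Combinations of `A`, `B`, and `C` that never occur in the data show up as `NaN` in the pivot table. As a sketch, the `aggfunc` and `fill_value` parameters change how cells are aggregated and what empty cells contain. The frame below repeats the data from above under the illustrative name `table_data`.

```python
import pandas as pd

table_data = pd.DataFrame({
    "A": ["foo", "foo", "foo", "bar", "bar", "bar"],
    "B": ["one", "one", "two", "two", "one", "one"],
    "C": ["x", "y", "x", "y", "x", "y"],
    "D": [1, 3, 2, 5, 4, 1],
})

# Sum D within each (A, B) x C cell and put 0 where a combination is absent
totals = table_data.pivot_table(
    values="D", index=["A", "B"], columns="C", aggfunc="sum", fill_value=0
)

print(totals)
```

Here the (`bar`, `two`) row has no `x` entry in the data, so that cell is filled with 0 instead of `NaN`.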