# Jupyter Playground

## Overview

[Pandas](https://pandas.pydata.org/) is one of the most widely used Python libraries in data science. We have imported the basic libraries you need for the following common data wrangling operations in Pandas:

* Creating DataFrames
* Slicing DataFrames (i.e. selecting rows and columns)
* Filtering data (using boolean arrays and groupby.filter)
* Aggregating (using groupby.agg)
* Visualizing data (using matplotlib.pyplot)
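As a quick orientation, the first four operations can be sketched on a small table. The DataFrame below is hypothetical toy data (the names `df`, `team`, and `score` are made up for illustration and are not part of the provided CSV files). Plotting is left out here to keep the sketch short.

```python
import pandas as pd

# Hypothetical toy data, not one of the provided CSV files.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Di"],
    "team": ["ops", "ops", "dev", "dev"],
    "score": [3, 5, 4, 8],
})

# Slicing: select rows by index label (inclusive on both ends) and columns by name.
first_two = df.loc[0:1, ["name", "score"]]

# Filtering with a boolean array: keep rows where score exceeds 4.
high = df[df["score"] > 4]

# Filtering with groupby.filter: keep only teams whose mean score is at least 5.
strong_teams = df.groupby("team").filter(lambda g: g["score"].mean() >= 5)

# Aggregating with groupby.agg: one summary row per team.
summary = df.groupby("team").agg(mean_score=("score", "mean"), n=("name", "count"))
```

Note that `groupby.filter` keeps or drops whole groups, whereas a boolean array keeps or drops individual rows.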

We have provided 2 dummy CSV files (the `username` dataset and `email` dataset) to help you get started. Of course, feel free to import your own libraries and datasets!

In [None]:
import numpy as np
import pandas as pd

import matplotlib
import matplotlib.pyplot as plt

from js import fetch

First, let's define the `fetch_and_read` function, which downloads a CSV file and loads its contents into a DataFrame.

In [11]:
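async def fetch_and_read(data_url):
    """Download a semicolon-delimited CSV from data_url and return it as a DataFrame."""
    # Request the file with the browser's fetch API (imported from `js` above)
    res = await fetch(data_url)
    text = await res.text()

    # Save the downloaded text to a local file so that pandas can read it
    filename = 'data.csv'
    with open(filename, 'w') as f:
        f.write(text)

    # The CSV files used here separate fields with ';' rather than ','
    data = pd.read_csv(filename, sep=';')
    return data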
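The parsing step can also be tried on its own, without a network request or a temporary file. The sketch below is an alternative, not the notebook's method: it assumes the CSV text is already in a string, and the helper name `read_semicolon_csv` and the sample rows are hypothetical.

```python
import io

import pandas as pd


def read_semicolon_csv(text):
    """Parse semicolon-delimited CSV text into a DataFrame without writing to disk."""
    # io.StringIO wraps the string in a file-like object that read_csv accepts.
    return pd.read_csv(io.StringIO(text), sep=";")


# Hypothetical sample text, shaped like a semicolon-delimited export.
sample_text = "Login email;Identifier\nana@example.com;101\nben@example.com;102\n"
sample = read_semicolon_csv(sample_text)
```

Leaving out `sep=";"` would make pandas treat each line as a single comma-delimited field, so the result would have one column instead of two.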

The cell below displays the 5 rows of the `email` dataset for reference. This dataset contains fake onboarding information for 5 new employees. The columns are:

* `Login email` __(str):__ Employee login email

* `Identifier` __(int):__ Unique employee ID

* `First name` __(str):__ First name of employee

* `Last name` __(str):__ Last name of employee

In [None]:
email = await fetch_and_read("https://support.staffbase.com/hc/en-us/article_attachments/360009197071/email.csv")
email

The cell below displays the 5 rows of the `username` dataset for reference. This dataset contains fake onboarding information for 5 new employees. The columns are:

* `Username` __(str):__ Employee username

* `Identifier` __(int):__ Unique employee ID

* `One-time password` __(str):__ Employee access code

* `Recovery code` __(str):__ Employee recovery code

* `First name` __(str):__ First name of employee

* `Last name` __(str):__ Last name of employee

* `Department` __(str):__ Department that the employee belongs to

* `Location` __(str):__ Where the employee is located

In [None]:
username = await fetch_and_read("https://support.staffbase.com/hc/en-us/article_attachments/360009197011/username-password-recovery-code.csv")

# Strip the leading space from the " Identifier" column name so that both datasets share the same merge key
username = username.rename(columns={" Identifier": "Identifier"})
username
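If several headers might carry stray whitespace, one option is to strip every column name at once instead of renaming them one by one. This is a minimal sketch on hypothetical sample text (the row values are made up). It relies on `DataFrame.rename` accepting a function for `columns`.

```python
import io

import pandas as pd

# Hypothetical sample text with a leading space before "Identifier" in the header.
raw_text = "Username; Identifier;First name\nuser_a;101;Ana\n"
raw = pd.read_csv(io.StringIO(raw_text), sep=";")

# Apply str.strip to every column name; names without extra whitespace are unchanged.
cleaned = raw.rename(columns=str.strip)
```

Here `raw` still has a column named `" Identifier"`, while `cleaned` has `"Identifier"`.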

## Example

Here, we use `df.merge` to perform an inner merge of the `username` dataset with the `email` dataset on the `Identifier` column, which finds the email for each username. Notice that `booker12` has no email and is therefore left out of the merged DataFrame.

In [None]:
employee_username_and_email = username.merge(email, how="inner", on="Identifier")
employee_username_and_email
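To see which rows an inner merge drops, one option is a left merge with `indicator=True`, which adds a `_merge` column recording where each row was found. The sketch below uses hypothetical toy frames (`users`, `emails`, and their values are made up) rather than the datasets above.

```python
import pandas as pd

# Hypothetical toy frames: user 101 has no matching email row.
users = pd.DataFrame({
    "Identifier": [101, 102, 103],
    "Username": ["user_a", "user_b", "user_c"],
})
emails = pd.DataFrame({
    "Identifier": [102, 103],
    "Login email": ["b@example.com", "c@example.com"],
})

# A left merge keeps every row of `users`; `_merge` is "left_only" where no email matched.
checked = users.merge(emails, how="left", on="Identifier", indicator=True)
missing_email = checked.loc[checked["_merge"] == "left_only", "Username"].tolist()

# The inner merge, for comparison, keeps only identifiers present in both frames.
matched = users.merge(emails, how="inner", on="Identifier")
```

The same pattern applied to `username` and `email` would list the usernames that lack an email.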

## Your turn

Write your own code snippets here and create new cells as you see fit!

In [None]:
# Your code