### 📘 **Welcome to Your First Notebook**

<span style="color:slateblue"><strong>This notebook is a hands-on introduction to using Jupyter Notebooks for data analysis.</strong></span>  
You'll learn how to run code, explore data, write notes, and understand the interface along the way.

This text you're reading is in a **cell**, specifically a **Markdown cell** — used to write formatted text, explanations, and notes. Instead of using buttons or menus, you add simple symbols (like `#`, `*`, or `[]`) to control how the text appears.  

[**Learn about Markdown here**](https://www.markdownguide.org/)

You can **see the cell type** in the bottom-right corner of the cell: it will say `"Markdown"` for a text cell, or the kernel language (such as `"Python"`) for a code cell. Press `Shift` + `Enter` to **render** the Markdown. Double-click the cell to go back into **edit mode** and make changes.

In [None]:
#  !!!!!  Time to Code  !!!!
#
# This is a code cell — a place to write and run Python code.
# You'll notice in the bottom-right corner of this cell it says "Python",
# which tells you it's a code cell using the Python language kernel.
# Try running this cell by clicking inside it and pressing Shift + Enter.
# You’ll see the result printed below.

print("Hello, Jupyter!")

## Organizing Your Analysis

One of the strengths of Jupyter Notebooks is how you can break your work into <span style="color:darkblue"><strong>clear, logical steps</strong></span> using cells.

Think of each **code cell** as a single action or idea:  
- 📥 Load your data in one cell  
- 🔍 Explore it in another  
- 📊 Visualize or transform it in the next  

This structure makes your analysis easier to <span style="color:darkgreen"><strong>follow, debug, and share</strong></span> — whether you're revisiting it later or handing it off to someone else.

---

In between code, use **Markdown cells** to capture your  
<span style="color:#8B008B"><strong>thoughts, assumptions, and observations</strong></span>.

This is especially helpful when:  
- You're working through ideas and want to reflect as you go  
- You're sharing the notebook with others  
- You're collaborating, and want teammates to follow your reasoning  

Think of it as building a <span style="color:firebrick"><strong>chain of thought</strong></span> — guiding anyone reading the notebook through not just **what** you did, but **why** you did it.

Insert a new cell using the **+** button. Keyboard shortcuts make inserting, running, and moving between cells faster.  
[Jupyter Notebook Keyboard Shortcuts Cheat Sheet](https://mljar.com/blog/jupyter-notebook-shortcuts-cheatsheet/)

### Let's perform a simple data analysis using fake vehicle sales data to demonstrate how to break the analysis up into steps

<span style="color:#8B008B"><strong>How long does it typically take for a vehicle to sell after it's been wholesaled?</strong></span>

We’ll look at the time difference between the **WholesaleDate** and the **SaleDate** for each vehicle in the dataset.

This kind of focused question helps demonstrate the power of notebooks — we can break the problem into steps, write code to explore it, visualize the results, and document insights along the way.
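
Before working with the full dataset, it can help to see the calculation for a single vehicle. The sketch below uses two made-up dates (they are not taken from the dataset) and Python's standard `datetime` module to show what "days between wholesale and sale" means in code.

```python
from datetime import date

# Hypothetical dates for one vehicle (illustrative only, not from the dataset)
wholesale_date = date(2024, 3, 1)
sale_date = date(2024, 3, 15)

# Subtracting two dates gives a timedelta; .days is the whole-day difference
days_to_sell = (sale_date - wholesale_date).days
print(days_to_sell)  # 14
```

The analysis later in this notebook applies the same subtraction to every row at once using pandas.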

In [None]:
# Load Vehicle Sales Data from CSV
import pandas as pd

# Load dataset from the relative path to a pandas dataframe
df = pd.read_csv("../data/fake_sales_data.csv")

# Display the first few rows of the data
df.head()

## Exploring the Data

Now that the dataset is loaded, the next step is to get a quick overview of what you're working with.

This includes:
- Seeing how many rows and columns there are
- Understanding what each column contains
- Checking for missing values
- Getting basic summary statistics

These steps help you decide what to clean, analyze, or visualize — and give you a sense of the data’s overall shape and quality.

In [None]:
# Run three commands in one cell
df.info()                
df.describe()            
df.isnull().sum()        

## Understanding Output in a Code Cell

In a Jupyter Notebook, only the **last expression** in a code cell will automatically display output — unless a function explicitly prints something.

- `df.info()` **prints** its result to the screen, so it always shows.
- `df.describe()` **returns** a DataFrame, but won’t display unless it’s the last line **or** wrapped in `print()`, as in `print(df.describe())`.
- `df.isnull().sum()` also returns a value, and since it's the **last line**, Jupyter shows it in the output by default.

To see the output of multiple commands in one cell, either:
- Use `print()` for the ones you want to show, or
- Split them into separate cells for clarity.
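
The first option can be sketched as follows. The small `sample` DataFrame is a hypothetical stand-in for the loaded dataset, with one missing value added on purpose so the missing-value count is not all zeros.

```python
import pandas as pd

# Hypothetical three-row sample (illustrative only, not the real dataset)
sample = pd.DataFrame({
    "DaysToSell": [12, 30, None],
    "Region": ["East", "West", "East"],
})

# print() forces output from a line that is not the last one in the cell
print(sample.describe())

# A result stored in a variable shows nothing until it is printed
# or left on its own as the last line of the cell
missing_counts = sample.isnull().sum()
print(missing_counts)
```

Printed DataFrames appear as plain text rather than the formatted table Jupyter shows for the last expression, which is one reason splitting into separate cells is often the cleaner choice.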

In [None]:
# To perform the analysis, we need to convert the WholesaleDate and
# SaleDate columns to datetime format, as they are currently object (text) data types
df["SaleDate"] = pd.to_datetime(df["SaleDate"])
df["WholesaleDate"] = pd.to_datetime(df["WholesaleDate"])

# Calculate days between wholesale and sale and store the result in a new column DaysToSell
df["DaysToSell"] = (df["SaleDate"] - df["WholesaleDate"]).dt.days

# Preview the updated DataFrame
df[["SaleDate", "WholesaleDate", "DaysToSell"]].head()

## Fixing Mistakes and Iterating

One of the best things about using a notebook is that **you don’t have to get everything right on the first try**. If you make a mistake in a cell — whether it’s a typo, logic error, or wrong calculation — you can simply update the code and **re-run the cell**. You don’t need to start over or re-import your data from scratch. Your data is already in memory, and your notebook keeps track of each step as you go. **This makes it easy to experiment, test ideas, and fix issues without breaking your workflow.**
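
As a rough illustration of that fix-and-re-run loop, the sketch below makes a deliberate mistake (subtracting the dates in the wrong order) and then corrects it. The two-row `sample` DataFrame and its dates are hypothetical; in the notebook, you would edit the existing cell and run it again rather than writing both versions.

```python
import pandas as pd

# Hypothetical two-row sample standing in for the data already in memory
sample = pd.DataFrame({
    "WholesaleDate": pd.to_datetime(["2024-01-02", "2024-01-10"]),
    "SaleDate": pd.to_datetime(["2024-01-12", "2024-02-09"]),
})

# First attempt: dates subtracted in the wrong order, so the values are negative
sample["DaysToSell"] = (sample["WholesaleDate"] - sample["SaleDate"]).dt.days
print(sample["DaysToSell"].tolist())  # [-10, -30]

# Corrected version: re-running the assignment overwrites the column,
# and the data did not need to be reloaded
sample["DaysToSell"] = (sample["SaleDate"] - sample["WholesaleDate"]).dt.days
print(sample["DaysToSell"].tolist())  # [10, 30]
```

Because the assignment replaces the whole column each time, running the corrected cell again gives the same result.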

Now, let's visualize our data!

In [None]:
# Let’s see how many vehicles were sold within different time ranges to get a 
# sense of how long cars typically take to sell.
import matplotlib.pyplot as plt

# 📊 Plot a histogram of days to sell
plt.figure(figsize=(8, 5))
plt.hist(df["DaysToSell"], bins=20, edgecolor="black")
plt.title("Distribution of Days to Sell a Vehicle")
plt.xlabel("Days")
plt.ylabel("Number of Vehicles")
plt.grid(True)
plt.show()

In [None]:
# Let’s compare how long it takes to sell vehicles in each region to 
# see where cars move faster or slower.
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 5))
sns.boxplot(data=df, x="Region", y="DaysToSell")
plt.title("Days to Sell by Region")
plt.xlabel("Region")
plt.ylabel("Days to Sell")
plt.xticks(rotation=45)
plt.show()

In [None]:
# Let’s find out if certain times of the year are better for selling cars by 
# looking at the average days to sell across each month.

# Extract month name from SaleDate
df["MonthName"] = df["SaleDate"].dt.strftime("%B")

# Order months properly (not alphabetically)
month_order = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# Group and plot
monthly_avg = df.groupby("MonthName")["DaysToSell"].mean().reindex(month_order)

monthly_avg.plot(kind="bar", figsize=(10, 5), color="skyblue", edgecolor="black")
plt.title("Average Days to Sell by Month")
plt.xlabel("Month")
plt.ylabel("Avg. Days to Sell")
plt.xticks(rotation=45)
plt.grid(axis="y")
plt.show()

## Wrapping Up

In this notebook, we explored how to use Jupyter Notebooks as both a coding environment and a storytelling tool for data analysis. We:
- Defined a real-world question: **How long does it take to sell a vehicle after it's wholesaled?**
- Broke the analysis into clear, repeatable steps using separate cells
- Created a new calculated column to measure time-to-sale
- Built visualizations to uncover patterns across regions and months

This structure shows what notebooks do well: not just running code, but building a **narrative** that collaborators or future-you can understand, follow, and build upon. A natural next step is to add Markdown cells after the charts to describe your observations.

Remember, Markdown isn’t just for decoration — it's your tool for:
- Capturing insights alongside code
- Adding context to visualizations
- Guiding collaborators without needing to be in the same room

## Keep Learning

Here are a few great resources to help you go deeper with Jupyter:

- [Project Jupyter Official Site](https://jupyter.org)
- [Jupyter Notebook Beginner’s Guide – Real Python](https://realpython.com/jupyter-notebook-introduction/)
- [More Notebook Tutorials](https://jupyter.org/try-jupyter/notebooks/?path=notebooks/Intro.ipynb)

Thanks for taking a walk on Jupyter. Hope you found this helpful.  
**Reach out to us on DataDome if you have questions or need help!**