# Using Jupyter Notebooks
:label:`sec_jupyter`


This section describes how to edit and run the code
in each section of this book
using the Jupyter Notebook. Make sure you have
installed Jupyter and downloaded the
code as described in
:ref:`chap_installation`.
If you want to know more about Jupyter, see the excellent tutorial in
its [documentation](https://jupyter.readthedocs.io/en/latest/).


## Editing and Running the Code Locally

Suppose that the local path of the book's code is `xx/yy/d2l-en/`. Use the shell to change the directory to this path (`cd xx/yy/d2l-en`) and run the command `jupyter notebook`. If your browser does not do this automatically, open http://localhost:8888 and you will see the interface of Jupyter and all the folders containing the code of the book, as shown in :numref:`fig_jupyter00`.

![The folders containing the code of this book.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter00.png?raw=1)
:width:`600px`
:label:`fig_jupyter00`


You can access the notebook files by clicking on the folder displayed on the webpage.
They usually have the suffix ".ipynb".
For the sake of brevity, we create a temporary "test.ipynb" file.
The content displayed after you click it is
shown in :numref:`fig_jupyter01`.
This notebook includes a markdown cell and a code cell. The content in the markdown cell includes "This Is a Title" and "This is text.".
The code cell contains two lines of Python code.

![Markdown and code cells in the "test.ipynb" file.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter01.png?raw=1)
:width:`600px`
:label:`fig_jupyter01`


Double click on the markdown cell to enter edit mode.
Add a new text string "Hello world." at the end of the cell, as shown in :numref:`fig_jupyter02`.

![Edit the markdown cell.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter02.png?raw=1)
:width:`600px`
:label:`fig_jupyter02`


As demonstrated in :numref:`fig_jupyter03`,
click "Cell" $\rightarrow$ "Run Cells" in the menu bar to run the edited cell.

![Run the cell.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter03.png?raw=1)
:width:`600px`
:label:`fig_jupyter03`

After running, the markdown cell is shown in :numref:`fig_jupyter04`.

![The markdown cell after running.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter04.png?raw=1)
:width:`600px`
:label:`fig_jupyter04`


Next, click on the code cell. Multiply the elements by 2 after the last line of code, as shown in :numref:`fig_jupyter05`.

![Edit the code cell.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter05.png?raw=1)
:width:`600px`
:label:`fig_jupyter05`


You can also run the cell with a shortcut ("Ctrl + Enter" by default) and obtain the output result from :numref:`fig_jupyter06`.

![Run the code cell to obtain the output.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter06.png?raw=1)
:width:`600px`
:label:`fig_jupyter06`


When a notebook contains more cells, we can click "Kernel" $\rightarrow$ "Restart & Run All" in the menu bar to run all the cells in the entire notebook. By clicking "Help" $\rightarrow$ "Edit Keyboard Shortcuts" in the menu bar, you can edit the shortcuts according to your preferences.

## Advanced Options

Beyond local editing, two things are quite important: editing the notebooks in the markdown format and running Jupyter remotely.
The latter matters when we want to run the code on a faster server.
The former matters since Jupyter's native ipynb format stores a lot of auxiliary data that is
irrelevant to the content,
mostly related to how and where the code is run.
This is confusing for Git, making
reviewing contributions very difficult.
Fortunately there is an alternative---native editing in the markdown format.

### Markdown Files in Jupyter

If you wish to contribute to the content of this book, you need to modify the
source file (md file, not ipynb file) on GitHub.
Using the notedown plugin we
can modify notebooks in the md format directly in Jupyter.


First, install the notedown plugin, run the Jupyter Notebook, and load the plugin:

```
pip install d2l-notedown  # You may need to uninstall the original notedown.
jupyter notebook --NotebookApp.contents_manager_class='notedown.NotedownContentsManager'
```

You may also turn on the notedown plugin by default whenever you run the Jupyter Notebook.
First, generate a Jupyter Notebook configuration file (if it has already been generated, you can skip this step).

```
jupyter notebook --generate-config
```

Then, add the following line to the end of the Jupyter Notebook configuration file (for Linux or macOS, usually in the path `~/.jupyter/jupyter_notebook_config.py`):

```
c.NotebookApp.contents_manager_class = 'notedown.NotedownContentsManager'
```

After that, you only need to run the `jupyter notebook` command to turn on the notedown plugin by default.

### Running Jupyter Notebooks on a Remote Server

Sometimes, you may want to run Jupyter notebooks on a remote server and access them through a browser on your local computer. If Linux or macOS is installed on your local machine (Windows can also support this function through third-party software such as PuTTY), you can use port forwarding:

```
ssh myserver -L 8888:localhost:8888
```

In the command above, `myserver` is the address of the remote server.
Then we can use http://localhost:8888 to access the Jupyter notebooks running on the remote server `myserver`. We will detail how to run Jupyter notebooks on AWS instances
later in this appendix.

### Timing

We can use the `ExecuteTime` plugin to time the execution of each code cell in Jupyter notebooks.
Use the following commands to install the plugin:

```
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
jupyter nbextension enable execute_time/ExecuteTime
```
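
When installing the extension is not an option, a rough timing can also be taken in plain Python. The following is only a minimal sketch using the standard library; the helper name `timed` is hypothetical and is not part of the `ExecuteTime` plugin.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label):
    """Measure and print the wall-clock time spent inside the `with` block."""
    result = {}
    start = time.perf_counter()
    try:
        yield result
    finally:
        # Store the elapsed time so the caller can inspect it afterwards.
        result["seconds"] = time.perf_counter() - start
        print(f"{label}: {result['seconds']:.4f} sec")


# Example: time a small computation.
with timed("sum of squares") as t:
    total = sum(i * i for i in range(100_000))
```

Inside a notebook, IPython's built-in `%%time` cell magic reports similar information for a single cell.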

## Summary

* Using the Jupyter Notebook tool, we can edit, run, and contribute to each section of the book.
* We can run Jupyter notebooks on remote servers using port forwarding.


## Exercises

1. Edit and run the code in this book with the Jupyter Notebook on your local machine.
1. Edit and run the code in this book with the Jupyter Notebook *remotely* via port forwarding.
1. Compare the running time of the operations $\mathbf{A}^\top \mathbf{B}$ and $\mathbf{A} \mathbf{B}$ for two square matrices in $\mathbb{R}^{1024 \times 1024}$. Which one is faster?


[Discussions](https://discuss.d2l.ai/t/421)


In [None]:
# let's solve the assignments now
# project structure
# student_project/
#     students.csv
#     main.py

In [2]:
# let's create a CSV with a header and records (bulk addition)
import csv
header = ["id", "name", "subject", "marks"]
records = [
    {"id": 1, "name": "Amit", "subject": "Math", "marks": 89},
    {"id": 2, "name": "Riya", "subject": "Science", "marks": 92},
    {"id": 3, "name": "John", "subject": "English", "marks": 76}
]
with open('students.csv','w',newline = '') as fp:
  writer = csv.DictWriter(fp,fieldnames = header)
  writer.writeheader()
  writer.writerows(records)

In [None]:
# this function will return total count of records present in the list(simple max_id)

In [5]:
# lets add some records manually
def add_student(id,name,subject,marks):
  '''It will add the records manually'''
  with open('students.csv','a',newline = '') as fp:
    writer = csv.DictWriter(fp, fieldnames=["id", "name", "subject", "marks"])
    writer.writerow({
            "id": id,
            "name": name,
            "subject": subject,
            "marks": marks
        })
  clean_data()

In [6]:
add_student(3,"Shaik Bushra Tabassum","Social",99)

In [9]:
def view_students():
    with open("students.csv", mode="r", newline="") as f:
        reader = list(csv.DictReader(f))
        print("\nID | Name | Subject | Marks")
        print("--------------------------------")
        for row in reader:
            print(f"{row['id']} | {row['name']} | {row['subject']} | {row['marks']}")

In [10]:
view_students()


ID | Name | Subject | Marks
--------------------------------
1 | Amit | Math | 89
2 | Riya | Science | 92
3 | John | English | 76
3 | Shaik Bushra Tabassum | Social | 99


In [None]:
# the bug: I can add entries with duplicate ids; this should not happen!
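
One way to address the duplicate-id problem noted above is to check the existing ids before appending. The following is a minimal sketch under stated assumptions: the helper name `add_student_unique` and the demo file `students_demo.csv` are hypothetical, and the demo writes to a temporary directory so the real `students.csv` is left untouched.

```python
import csv
import os
import tempfile

FIELDS = ["id", "name", "subject", "marks"]


def add_student_unique(path, student_id, name, subject, marks):
    """Append a record only if no existing row already uses the same id."""
    with open(path, "r", newline="") as fp:
        existing_ids = {row["id"] for row in csv.DictReader(fp)}
    # CSV values are read back as strings, so compare as strings.
    if str(student_id) in existing_ids:
        return False
    with open(path, "a", newline="") as fp:
        csv.DictWriter(fp, fieldnames=FIELDS).writerow(
            {"id": student_id, "name": name, "subject": subject, "marks": marks}
        )
    return True


# Demo on a throwaway file that already contains id 3.
demo_path = os.path.join(tempfile.mkdtemp(), "students_demo.csv")
with open(demo_path, "w", newline="") as fp:
    writer = csv.DictWriter(fp, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow({"id": 3, "name": "John", "subject": "English", "marks": 76})

first = add_student_unique(demo_path, 3, "Shaik Bushra Tabassum", "Social", 99)
second = add_student_unique(demo_path, 4, "Shaik Bushra Tabassum", "Social", 99)
print(first, second)
```

In this sketch the first call is rejected because id 3 is already taken, while the second call with id 4 is appended.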

In [13]:
# lets search by a student
def search_student(name):
  with open('students.csv','r',newline = '') as fp:
    reader = list(csv.DictReader(fp))  # locally i am creating the data and using that
    for row in reader:
      if row['name'].lower() == name.lower():
        return row
    return "Name Does not exist"

In [14]:
search_student("Shaik Bushra Tabassum")

{'id': '3',
 'name': 'Shaik Bushra Tabassum',
 'subject': 'Social',
 'marks': '99'}

In [None]:
# lets clean the data bro
# def clean_data(row):
#   with open("students.csv","w",newline = '') as fp:
#     reader = list(csv.DictReader(fp))
#     for row in reader:
#       row[]

In [34]:
# Important part: updating
# 1. read all the rows into a list
# 2. modify the data in the list
# 3. rewrite the csv using 'w'

# we are updating their marks

def update_marks(id,new_marks):

  students = [] # here we will insert the csv data
  with open("students.csv",'r',newline = '') as fp:
    reader = csv.DictReader(fp)
    students = list(reader)

  # lets update the marks
  updated = False
  for s in students:
    if s['id'] == str(id):
      s['marks'] = str(new_marks)
      updated = True
      break
  if not updated:
    print("Student not found")
    return
  # lets rewrite the file content i mean the whole csv
  with open("students.csv","w",newline = "") as fp:
    header = ["id","name","subject","marks"]
    writer = csv.DictWriter(fp,fieldnames=header)
    writer.writeheader()
    writer.writerows(students)
  print("Updated successfully!")
  clean_data()
  return



In [23]:
update_marks(1,100)

Updated successfully!


In [24]:
view_students()


ID | Name | Subject | Marks
--------------------------------
1 | Amit | Math | 100
2 | Riya | Science | 92
3 | John | English | 76
3 | Shaik Bushra Tabassum | Social | 99


In [30]:
# lets clean the data
def clean_data():
  clean_students = []  # this makes the fresh data available through out the function
  with open("students.csv","r",newline = "") as fp:
    clean_students = list(csv.DictReader(fp))
    for row in clean_students:
      row['id'] = int(row['id'])
      row['marks'] = int(row['marks'])

  # lets update this into present file this we need to do this again and again

  with open("students.csv","w",newline = "") as fp:
    header = ["id","name","subject","marks"]
    writer = csv.DictWriter(fp,fieldnames=header)
    writer.writeheader()
    writer.writerows(clean_students)
  return clean_students



In [32]:
print(clean_data())

[{'id': 1, 'name': 'Amit', 'subject': 'Math', 'marks': 100}, {'id': 2, 'name': 'Riya', 'subject': 'Science', 'marks': 92}, {'id': 3, 'name': 'John', 'subject': 'English', 'marks': 76}, {'id': 3, 'name': 'Shaik Bushra Tabassum', 'subject': 'Social', 'marks': 99}]


In [33]:
view_students()


ID | Name | Subject | Marks
--------------------------------
1 | Amit | Math | 100
2 | Riya | Science | 92
3 | John | English | 76
3 | Shaik Bushra Tabassum | Social | 99


In [39]:
# ⭐ Step 6: Calculate Class Average

def class_average():
  total_marks = 0
  count = 0

  with open("students.csv","r",newline = "") as fp:
    reader = list(csv.DictReader(fp)) # we can use this reader data anywhere inside the function

  for row in reader:
    total_marks += int(row['marks'])
    count+=1
  # avoid dividing by zero when the file has no records
  return total_marks/count if count else 0

In [40]:
print(class_average())

91.75


In [41]:
# ⭐ Step 7: Export Top Students (Marks > 90)

# in a new file list out all the toppers


ID | Name | Subject | Marks
--------------------------------
1 | Amit | Math | 100
2 | Riya | Science | 92
3 | John | English | 76
3 | Shaik Bushra Tabassum | Social | 99
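
The cell above states the goal of Step 7 but contains no code for it. A minimal sketch of such an export might look as follows; the function name `export_top_students`, the output file name, and the `threshold` parameter are assumptions made for illustration, and the demo data mirrors the table above but lives in a temporary directory.

```python
import csv
import os
import tempfile

FIELDS = ["id", "name", "subject", "marks"]


def export_top_students(src_path, dst_path, threshold=90):
    """Write rows whose marks exceed `threshold` into a new CSV file."""
    with open(src_path, "r", newline="") as fp:
        rows = list(csv.DictReader(fp))
    # Marks are read as strings, so convert before comparing.
    toppers = [row for row in rows if int(row["marks"]) > threshold]
    with open(dst_path, "w", newline="") as fp:
        writer = csv.DictWriter(fp, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(toppers)
    return toppers


# Demo files in a temporary directory (hypothetical names).
demo_dir = tempfile.mkdtemp()
src = os.path.join(demo_dir, "students_demo.csv")
dst = os.path.join(demo_dir, "top_students_demo.csv")
with open(src, "w", newline="") as fp:
    writer = csv.DictWriter(fp, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows([
        {"id": 1, "name": "Amit", "subject": "Math", "marks": 100},
        {"id": 2, "name": "Riya", "subject": "Science", "marks": 92},
        {"id": 3, "name": "John", "subject": "English", "marks": 76},
        {"id": 3, "name": "Shaik Bushra Tabassum", "subject": "Social", "marks": 99},
    ])

toppers = export_top_students(src, dst)
print([row["name"] for row in toppers])
```

With the demo data, only the rows with marks above 90 end up in the new file; John (76) is left out.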


In [None]:
# project structure
# expense_tracker/
#     expenses.csv
#     app.py


In [85]:
# lets write the header
import csv

with open("expenses.csv","w",newline = "") as fp:
  writer = csv.writer(fp)
  writer.writerow(["id", "date", "category", "description", "amount"])



In [86]:
def bulk_insert(records):
    fieldnames = ["id", "date", "category", "description", "amount"]

    with open("expenses.csv", mode="a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)

        writer.writerows(records)


In [87]:
records = [
    {"id": "1", "date": "2025-11-18", "category": "Food", "description": "Pizza", "amount": 250},
    {"id": "2", "date": "2025-11-18", "category": "Travel", "description": "Uber", "amount": 120}
]

bulk_insert(records)


In [88]:
# lets add single expenses
def add_expenses(exp_id,date,category,description,amount):
  with open("expenses.csv","a",newline = "") as fp:
    writer = csv.DictWriter(fp,fieldnames=["id","date","category","description","amount"])
    writer.writerow({
        "id": exp_id,
        "date": date,
        "category": category,
        "description": description,
        "amount": amount
    })
  print("Record inserted!")

In [89]:
add_expenses("3","2025-11-18","Travel","Ola",500)

Record inserted!


In [90]:
# ⭐ Step 4: View All Expenses
def view_expenses():
    with open("expenses.csv", "r", newline="") as f:
        reader = csv.DictReader(f)
        print("\nID | Date | Category | Description | Amount")
        print("-" * 60)
        for row in reader:
            print(f"{row['id']} | {row['date']} | {row['category']} | {row['description']} | {row['amount']}")


In [91]:
view_expenses()


ID | Date | Category | Description | Amount
------------------------------------------------------------
1 | 2025-11-18 | Food | Pizza | 250
2 | 2025-11-18 | Travel | Uber | 120
3 | 2025-11-18 | Travel | Ola | 500


In [None]:
# there are some problems here
# if we add another entry, duplicate ids are created
# ids are not in integer format


In [71]:
# 🎯 Summary
# CSV ALWAYS loads values as strings

# → That's why you think data isn't cleaned.

# Your cleaning code is correct

# → But saving back to CSV turns ints into strings (normal!)

# Correct way

# Use a clean reader function to convert types every time you read.
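
The "clean reader" idea from the summary above can be sketched as follows. This is only an illustration: the function name `read_expenses` and the demo file are hypothetical, and the sketch assumes that `id` and `amount` always hold whole numbers.

```python
import csv
import os
import tempfile

FIELDS = ["id", "date", "category", "description", "amount"]


def read_expenses(path):
    """Read the CSV and convert `id` and `amount` to int on every read."""
    with open(path, "r", newline="") as fp:
        rows = list(csv.DictReader(fp))
    for row in rows:
        # The csv module returns every value as a string.
        row["id"] = int(row["id"])
        row["amount"] = int(row["amount"])
    return rows


# Demo on a temporary file so expenses.csv is left untouched.
demo_path = os.path.join(tempfile.mkdtemp(), "expenses_demo.csv")
with open(demo_path, "w", newline="") as fp:
    writer = csv.DictWriter(fp, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows([
        {"id": 1, "date": "2025-11-18", "category": "Food",
         "description": "Pizza", "amount": 250},
        {"id": 2, "date": "2025-11-18", "category": "Travel",
         "description": "Uber", "amount": 120},
    ])

expenses = read_expenses(demo_path)
print(sum(row["amount"] for row in expenses))
```

Because the conversion happens at read time, the file itself can keep storing plain text and no separate cleaning pass over the CSV is needed.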

In [92]:
import os
def get_max_id():
  if not os.path.exists("expenses.csv"):
    return 0
  with open("expenses.csv","r",newline = "") as fp:
    reader = list(csv.DictReader(fp))
    max_id = 0
    for row in reader:
      max_id = max(max_id,int(row["id"]))
    return max_id


In [93]:
get_max_id()

3

In [94]:
# let's handle duplicate entries

def add_expenses(date,category,description,amount):
  max_id = get_max_id()
  with open("expenses.csv","a",newline = "") as fp:
    writer = csv.DictWriter(fp,fieldnames=["id","date","category","description","amount"])
    writer.writerow({
        "id": max_id + 1,
        "date": date,
        "category": category,
        "description": description,
        "amount": amount
    })
  print("Record inserted!")

In [95]:
add_expenses("2025-11-18","Travel","Ola",500)

Record inserted!
Data cleaned!


In [96]:
# let's check for duplicate entries
view_expenses()


ID | Date | Category | Description | Amount
------------------------------------------------------------
1 | 2025-11-18 | Food | Pizza | 250
2 | 2025-11-18 | Travel | Uber | 120
3 | 2025-11-18 | Travel | Ola | 500
4 | 2025-11-18 | Travel | Ola | 500


In [100]:
# most important: updating an expense
def update_expense(exp_id:int,new_amount:int):

  # remember, the csv module always reads values as strings
  # let's access the data
  updated = False
  with open("expenses.csv","r",newline = "") as fp:
    reader = list(csv.DictReader(fp))
  for row in reader:
    if int(row['id']) == exp_id:
      row['amount'] = str(new_amount)
      print("Updated!")
      updated = True
      break
  if not updated:
    print("Expense not found")
    return  # nothing changed, so skip rewriting the file
  # now let's rewrite the rows into the file
  with open("expenses.csv","w",newline = "") as fp:
    header = ["id","date","category","description","amount"]
    writer = csv.DictWriter(fp,fieldnames=header)
    writer.writeheader()
    writer.writerows(reader)



In [101]:
update_expense(1,500)

Updated!


In [102]:
view_expenses()


ID | Date | Category | Description | Amount
------------------------------------------------------------
1 | 2025-11-18 | Food | Pizza | 500
2 | 2025-11-18 | Travel | Uber | 120
3 | 2025-11-18 | Travel | Ola | 500
4 | 2025-11-18 | Travel | Ola | 500


In [103]:
# ⭐ Step 8: Delete an Expense
def delete_expense(exp_id:int)->None:

  with open("expenses.csv","r",newline = "") as fp:
    reader = list(csv.DictReader(fp))
  # easiest way to remove is ignore
  data = [row for row in reader if int(row['id']) != exp_id]

  # what if the exp_id is present or not?
  if len(data) == len(reader):
    print("Expense not found")
    return
  # lets update now
  with open("expenses.csv","w",newline = "") as fp:
    header = ["id","date","category","description","amount"]
    writer = csv.DictWriter(fp,fieldnames=header)
    writer.writeheader()
    writer.writerows(data)
  print("deleted!")

In [104]:
view_expenses()


ID | Date | Category | Description | Amount
------------------------------------------------------------
1 | 2025-11-18 | Food | Pizza | 500
2 | 2025-11-18 | Travel | Uber | 120
3 | 2025-11-18 | Travel | Ola | 500
4 | 2025-11-18 | Travel | Ola | 500


In [105]:
delete_expense(4)

In [106]:
view_expenses()


ID | Date | Category | Description | Amount
------------------------------------------------------------
1 | 2025-11-18 | Food | Pizza | 500
2 | 2025-11-18 | Travel | Uber | 120
3 | 2025-11-18 | Travel | Ola | 500


In [107]:
# ⭐ Step 9: Calculate Monthly Total
def monthly_totals(month):
  total_amount = 0
  with open("expenses.csv","r",newline = "") as fp:
    reader = list(csv.DictReader(fp))
  for row in reader:
    if row['date'].startswith(month):
      total_amount += int(row['amount'])
  return total_amount


In [109]:
monthly_totals("2024-11")

0

In [113]:
def search_expenses(key:int):
  with open("expenses.csv","r",newline = "") as fp:
    reader = list(csv.DictReader(fp))
  for row in reader:
    if int(row['id']) == key:
      return row
  return "Key not found"

search_expenses(2)

{'id': '2',
 'date': '2025-11-18',
 'category': 'Travel',
 'description': 'Uber',
 'amount': '120'}

In [117]:
def filter_by_category(category):
  with open("expenses.csv","r",newline = "") as fp:
    reader = list(csv.DictReader(fp))
  for row in reader:
    if row['category'].lower() == category.lower():
      return row
  return "Unknown category"

filter_by_category("travel")

{'id': '2',
 'date': '2025-11-18',
 'category': 'Travel',
 'description': 'Uber',
 'amount': '120'}
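
Note that `filter_by_category` above returns only the first matching row. If every matching expense is wanted, a variant could collect all matches into a list. The following is a minimal sketch; the name `filter_all_by_category` and the demo file are hypothetical.

```python
import csv
import os
import tempfile

FIELDS = ["id", "date", "category", "description", "amount"]


def filter_all_by_category(path, category):
    """Return every row whose category matches, ignoring case."""
    with open(path, "r", newline="") as fp:
        return [
            row for row in csv.DictReader(fp)
            if row["category"].lower() == category.lower()
        ]


# Demo on a temporary file with the same rows as the table shown earlier.
demo_path = os.path.join(tempfile.mkdtemp(), "expenses_demo.csv")
with open(demo_path, "w", newline="") as fp:
    writer = csv.DictWriter(fp, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows([
        {"id": 1, "date": "2025-11-18", "category": "Food",
         "description": "Pizza", "amount": 500},
        {"id": 2, "date": "2025-11-18", "category": "Travel",
         "description": "Uber", "amount": 120},
        {"id": 3, "date": "2025-11-18", "category": "Travel",
         "description": "Ola", "amount": 500},
    ])

matches = filter_all_by_category(demo_path, "travel")
print([row["description"] for row in matches])
```

An unknown category yields an empty list in this sketch, which callers can test with a plain `if not matches:` instead of comparing against a message string.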

In [None]:
# the best part
while True:
    print("\n--- Expense Tracker ---")
    print("1. Add Expense")
    print("2. View Expenses")
    print("3. Search Expense")
    print("4. Filter by Category")
    print("5. Update Expense")
    print("6. Delete Expense")
    print("7. Monthly Total")
    print("8. Bulk Insert")
    print("9. Exit")

    choice = input("Enter choice: ")

    if choice == "1":
        d = input("Date (YYYY-MM-DD): ")
        c = input("Category: ")
        ds = input("Description: ")
        a = input("Amount: ")
        add_expenses(d, c, ds, a)

    elif choice == "2":
        view_expenses()

    elif choice == "3":
        # search_expenses compares integer ids, so convert the input first
        k = int(input("Expense ID: "))
        print(search_expenses(k))

    elif choice == "4":
        c = input("Category: ")
        print(filter_by_category(c))

    elif choice == "5":
        i = int(input("Expense ID: "))
        a = int(input("New Amount: "))
        update_expense(i, a)

    elif choice == "6":
        i = int(input("Expense ID: "))
        delete_expense(i)

    elif choice == "7":
        m = input("Enter month (YYYY-MM): ")
        print("Total:", monthly_totals(m))

    elif choice == "8":
        print("Enter bulk records manually in code for now.")
        bulk_insert(records)

    elif choice == "9":
        break

    else:
        print("Invalid choice!")



--- Expense Tracker ---
1. Add Expense
2. View Expenses
3. Search Expense
4. Filter by Category
5. Update Expense
6. Delete Expense
7. Monthly Total
8. Bulk Insert
9. Exit
Enter choice: 2

ID | Date | Category | Description | Amount
------------------------------------------------------------
1 | 2025-11-18 | Food | Pizza | 500
2 | 2025-11-18 | Travel | Uber | 120
3 | 2025-11-18 | Travel | Ola | 500

--- Expense Tracker ---
1. Add Expense
2. View Expenses
3. Search Expense
4. Filter by Category
5. Update Expense
6. Delete Expense
7. Monthly Total
8. Bulk Insert
9. Exit
Enter choice: 6
Expense ID: 3

--- Expense Tracker ---
1. Add Expense
2. View Expenses
3. Search Expense
4. Filter by Category
5. Update Expense
6. Delete Expense
7. Monthly Total
8. Bulk Insert
9. Exit
Enter choice: 2

ID | Date | Category | Description | Amount
------------------------------------------------------------
1 | 2025-11-18 | Food | Pizza | 500
2 | 2025-11-18 | Travel | Uber | 120

--- Expense Tracker ---


In [None]:
# project 2
# library/
#     library.csv
#     app.py
