# Using Jupyter Notebooks
:label:`sec_jupyter`


This section describes how to edit and run the code
in each section of this book
using the Jupyter Notebook. Make sure you have
installed Jupyter and downloaded the
code as described in
:ref:`chap_installation`.
If you want to know more about Jupyter, see the excellent tutorial in
their [documentation](https://jupyter.readthedocs.io/en/latest/).


## Editing and Running the Code Locally

Suppose that the local path of the book's code is `xx/yy/d2l-en/`. Use the shell to change the directory to this path (`cd xx/yy/d2l-en`) and run the command `jupyter notebook`. If your browser does not open automatically, open http://localhost:8888 and you will see the interface of Jupyter and all the folders containing the code of the book, as shown in :numref:`fig_jupyter00`.

![The folders containing the code of this book.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter00.png?raw=1)
:width:`600px`
:label:`fig_jupyter00`


You can access the notebook files by clicking on the folder displayed on the webpage.
They usually have the suffix ".ipynb".
For the sake of brevity, we create a temporary "test.ipynb" file.
The content displayed after you click it is
shown in :numref:`fig_jupyter01`.
This notebook includes a markdown cell and a code cell. The markdown cell contains "This Is a Title" and "This is text".
The code cell contains two lines of Python code.

![Markdown and code cells in the "test.ipynb" file.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter01.png?raw=1)
:width:`600px`
:label:`fig_jupyter01`


Double click on the markdown cell to enter edit mode.
Add a new text string "Hello world." at the end of the cell, as shown in :numref:`fig_jupyter02`.

![Edit the markdown cell.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter02.png?raw=1)
:width:`600px`
:label:`fig_jupyter02`


As demonstrated in :numref:`fig_jupyter03`,
click "Cell" $\rightarrow$ "Run Cells" in the menu bar to run the edited cell.

![Run the cell.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter03.png?raw=1)
:width:`600px`
:label:`fig_jupyter03`

After running, the markdown cell is shown in :numref:`fig_jupyter04`.

![The markdown cell after running.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter04.png?raw=1)
:width:`600px`
:label:`fig_jupyter04`


Next, click on the code cell. Multiply the elements by 2 after the last line of code, as shown in :numref:`fig_jupyter05`.

![Edit the code cell.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter05.png?raw=1)
:width:`600px`
:label:`fig_jupyter05`


You can also run the cell with a shortcut ("Ctrl + Enter" by default) and obtain the output shown in :numref:`fig_jupyter06`.

![Run the code cell to obtain the output.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter06.png?raw=1)
:width:`600px`
:label:`fig_jupyter06`


When a notebook contains more cells, we can click "Kernel" $\rightarrow$ "Restart & Run All" in the menu bar to run all the cells in the entire notebook. By clicking "Help" $\rightarrow$ "Edit Keyboard Shortcuts" in the menu bar, you can edit the shortcuts according to your preferences.

## Advanced Options

Beyond local editing, two things are quite important: editing the notebooks in the markdown format and running Jupyter remotely.
The latter matters when we want to run the code on a faster server.
The former matters since Jupyter's native ipynb format stores a lot of auxiliary data that is
irrelevant to the content,
mostly related to how and where the code is run.
This is confusing for Git, making
reviewing contributions very difficult.
Fortunately, there is an alternative: native editing in the markdown format.
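
To make the problem concrete, recall that an ipynb file is a JSON document that stores execution counters and cell outputs next to the source code. The following is a minimal sketch using only the standard library; `make_notebook` is a hypothetical helper and the notebook is heavily simplified (real files carry more metadata). It shows that rerunning an unchanged cell still changes the file:

```python
import json


def make_notebook(execution_count, output_text):
    """Build a simplified notebook dict with a single code cell (illustrative only)."""
    return {
        "cells": [{
            "cell_type": "code",
            "execution_count": execution_count,  # changes every time the cell is run
            "metadata": {},
            "outputs": [{"name": "stdout", "output_type": "stream",
                         "text": [output_text]}],
            "source": ["print('hello')"],
        }],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# The same source code, saved after two different runs.
first_run = json.dumps(make_notebook(1, "hello\n"), indent=1, sort_keys=True)
second_run = json.dumps(make_notebook(7, "hello\n"), indent=1, sort_keys=True)

# Collect the lines that a line-based diff would report as modified.
changed = [(a, b) for a, b in zip(first_run.splitlines(), second_run.splitlines())
           if a != b]
print(changed)
```

The source is identical in both versions, yet a line-based diff reports a modification. A markdown file contains only text and code, so this kind of noise does not arise.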

### Markdown Files in Jupyter

If you wish to contribute to the content of this book, you need to modify the
source file (md file, not ipynb file) on GitHub.
Using the notedown plugin we
can modify notebooks in the md format directly in Jupyter.


First, install the notedown plugin, run the Jupyter Notebook, and load the plugin:

```
pip install d2l-notedown  # You may need to uninstall the original notedown.
jupyter notebook --NotebookApp.contents_manager_class='notedown.NotedownContentsManager'
```

You may also turn on the notedown plugin by default whenever you run the Jupyter Notebook.
First, generate a Jupyter Notebook configuration file (if it has already been generated, you can skip this step).

```
jupyter notebook --generate-config
```

Then, add the following line to the end of the Jupyter Notebook configuration file (for Linux or macOS, usually in the path `~/.jupyter/jupyter_notebook_config.py`):

```
c.NotebookApp.contents_manager_class = 'notedown.NotedownContentsManager'
```

After that, you only need to run the `jupyter notebook` command to turn on the notedown plugin by default.

### Running Jupyter Notebooks on a Remote Server

Sometimes, you may want to run Jupyter notebooks on a remote server and access them through a browser on your local computer. If Linux or macOS is installed on your local machine (Windows can also support this function through third-party software such as PuTTY), you can use port forwarding:

```
ssh myserver -L 8888:localhost:8888
```

The above string `myserver` is the address of the remote server.
Then we can use http://localhost:8888 to access the remote server `myserver` that runs Jupyter notebooks. We will detail how to run Jupyter notebooks on AWS instances
later in this appendix.
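
If the page does not load, it can help to check whether anything is listening on the forwarded port at all. The following is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper, and port 8888 is simply the value used in the command above:

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


print(port_is_open())
```

A result of `False` while the `ssh` session is open suggests that the tunnel points to a different port or that Jupyter is not running on the server.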

### Timing

We can use the `ExecuteTime` plugin to time the execution of each code cell in Jupyter notebooks.
Use the following commands to install the plugin:

```
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
jupyter nbextension enable execute_time/ExecuteTime
```
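
When installing an extension is not an option, a single computation can also be timed by hand. The following is a minimal sketch using only the standard library; `timed` is a hypothetical helper and this is not how `ExecuteTime` works internally:

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return its result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = timed(sum, range(1_000_000))
print(f"sum = {result}, took {elapsed:.4f} s")
```

Inside a notebook, the IPython magics `%%time` and `%timeit` serve the same purpose without any helper code.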

## Summary

* Using the Jupyter Notebook tool, we can edit, run, and contribute to each section of the book.
* We can run Jupyter notebooks on remote servers using port forwarding.


## Exercises

1. Edit and run the code in this book with the Jupyter Notebook on your local machine.
1. Edit and run the code in this book with the Jupyter Notebook *remotely* via port forwarding.
1. Compare the running time of the operations $\mathbf{A}^\top \mathbf{B}$ and $\mathbf{A} \mathbf{B}$ for two square matrices in $\mathbb{R}^{1024 \times 1024}$. Which one is faster?


[Discussions](https://discuss.d2l.ai/t/421)


In [81]:
!pip install chromadb



In [82]:
import chromadb
from google.colab import userdata

# Read the ChromaDB Cloud credentials from Colab Secrets rather than hard-coding them.
# To add a secret, click on the "🔑" icon in the left sidebar, create a new secret
# (for example 'CHROMA_DB_CLOUD_API_KEY'), and paste the value.
CHROMA_DB_CLOUD_API_KEY = userdata.get('CHROMA_DB_CLOUD_API_KEY')
CHROMA_DB_CLOUD_TENANT = userdata.get('CHROMA_DB_CLOUD_TENANT')
CHROMA_DB_CLOUD_DATABASE = userdata.get('CHROMA_DB_CLOUD_DATABASE')


client = chromadb.CloudClient(
  api_key=CHROMA_DB_CLOUD_API_KEY,
  tenant=CHROMA_DB_CLOUD_TENANT,
  database=CHROMA_DB_CLOUD_DATABASE
)

# You can uncomment the line below to check if the connection is successful
# print(client.heartbeat())

In [83]:
from google.colab import userdata
from IPython.display import display

try:
    tenant_secret_value = userdata.get('CHROMA_DB_CLOUD_TENANT')
    print("Value of CHROMA_DB_CLOUD_TENANT secret:")
    display(tenant_secret_value)
    print("\nPlease carefully inspect the output above for any leading/trailing spaces or other unexpected characters.")
except userdata.SecretNotFoundError:
    print("CHROMA_DB_CLOUD_TENANT secret not found. Please make sure you have added it to Colab Secrets.")
except Exception as e:
    print(f"An error occurred while retrieving the secret: {e}")

Value of CHROMA_DB_CLOUD_TENANT secret:


'67b449aa-455f-4bf3-8623-d84aa13916fd'


Please carefully inspect the output above for any leading/trailing spaces or other unexpected characters.


In [84]:
# Assuming 'collection' is the ChromaDB collection object
query_results_risks = collection.query(
    query_texts=["potential risks"],
    n_results=3  # Adjust as needed
)

print("Query Results for 'potential risks':")
print(query_results_risks)

Query Results for 'potential risks':
{'ids': [['demoblaze_brd_document']], 'distances': [[1.8914368]], 'embeddings': None, 'metadatas': [[None]], 'documents': [['Sure! Below is your Business Requirements Document (BRD) for the Demoblaze Online Store Enhancement / Redesign project, but with sample data filled in where there were originally blank placeholders such as date, name, etc.\n\nBusiness Requirements Document (BRD)\nProject: Demoblaze Online Store Enhancement / Redesign\nVersion: 1.0\nDate: September 16, 2025\nAuthor: Priya Sharma\nStakeholders: Product Owner, Marketing, Development Team, UX/UI Team, Customer Support\n\n1. Executive Summary\nDemoblaze is an online demo/product store featuring electronic items such as phones, laptops, monitors; it includes browsing, signup/login, product listings, cart, contact us, and about us pages. The goal is to clearly capture business requirements for enhancements/new features, improve user experience, ensure performance, scalability, and su

In [85]:
!pip install python-docx



In [86]:
import docx

def read_docx(file_path):
    """Reads text from a .docx file."""
    doc = docx.Document(file_path)
    text = []
    for paragraph in doc.paragraphs:
        text.append(paragraph.text)
    return "\n".join(text)

# Replace with the actual path to your document
document_path = '/content/sample_data/content/DemoBlaze BRD.docx'
document_content = read_docx(document_path)

# You can print the first 500 characters to verify
# print(document_content[:500])

In [87]:
collection_name = "demoblaze_brd"  # You can choose a descriptive name

try:
    # Try to get the collection if it already exists
    collection = client.get_collection(name=collection_name)
    print(f"Collection '{collection_name}' already exists.")
except Exception:
    # If the collection doesn't exist, create it
    collection = client.create_collection(name=collection_name)
    print(f"Collection '{collection_name}' created.")

Collection 'demoblaze_brd' already exists.


In [88]:
# Assuming 'document_content' contains the text read from the docx file
# and 'collection' is the ChromaDB collection object

# ChromaDB requires a list of documents and a list of unique IDs for each document.
# Since we have a single document, we'll create lists with one element.
documents = [document_content]
ids = ["demoblaze_brd_document"] # You can use a more descriptive or unique ID

collection.add(
    documents=documents,
    ids=ids
)

print(f"Document inserted into collection '{collection_name}'.")

Document inserted into collection 'demoblaze_brd'.


In [89]:
# Assuming 'collection' is the ChromaDB collection object

# To verify insertion, we can retrieve the document by its ID
retrieved_document = collection.get(ids=["demoblaze_brd_document"])

# Print the retrieved document to verify
print("Retrieved Document:")
print(retrieved_document)

# You can also count the number of items in the collection to verify
count = collection.count()
print(f"\nNumber of items in the collection: {count}")

# You can also query the collection with a sample query to see relevant results
# query_results = collection.query(
#     query_texts=["What is the purpose of this document?"],
#     n_results=1
# )
# print("\nQuery Results:")
# print(query_results)

Retrieved Document:
{'ids': ['demoblaze_brd_document'], 'embeddings': None, 'metadatas': [None], 'documents': ['Sure! Below is your Business Requirements Document (BRD) for the Demoblaze Online Store Enhancement / Redesign project, but with sample data filled in where there were originally blank placeholders such as date, name, etc.\n\nBusiness Requirements Document (BRD)\nProject: Demoblaze Online Store Enhancement / Redesign\nVersion: 1.0\nDate: September 16, 2025\nAuthor: Priya Sharma\nStakeholders: Product Owner, Marketing, Development Team, UX/UI Team, Customer Support\n\n1. Executive Summary\nDemoblaze is an online demo/product store featuring electronic items such as phones, laptops, monitors; it includes browsing, signup/login, product listings, cart, contact us, and about us pages. The goal is to clearly capture business requirements for enhancements/new features, improve user experience, ensure performance, scalability, and support business growth.\n\n2. Business Objectives\n

In [90]:
# Assuming 'collection' is the ChromaDB collection object
query_results = collection.query(
    query_texts=["project objectives"],
    n_results=3  # You can adjust the number of results as needed
)

print("Query Results for 'project objectives':")
print(query_results)

Query Results for 'project objectives':
{'ids': [['demoblaze_brd_document']], 'distances': [[1.4586914]], 'embeddings': None, 'metadatas': [[None]], 'documents': [['Sure! Below is your Business Requirements Document (BRD) for the Demoblaze Online Store Enhancement / Redesign project, but with sample data filled in where there were originally blank placeholders such as date, name, etc.\n\nBusiness Requirements Document (BRD)\nProject: Demoblaze Online Store Enhancement / Redesign\nVersion: 1.0\nDate: September 16, 2025\nAuthor: Priya Sharma\nStakeholders: Product Owner, Marketing, Development Team, UX/UI Team, Customer Support\n\n1. Executive Summary\nDemoblaze is an online demo/product store featuring electronic items such as phones, laptops, monitors; it includes browsing, signup/login, product listings, cart, contact us, and about us pages. The goal is to clearly capture business requirements for enhancements/new features, improve user experience, ensure performance, scalability, and

In [91]:
# Assuming 'query_results' contains the results from the previous query

# Extract the document content from the query results
# The structure of query_results is a dictionary, and the document content is in the 'documents' key
# 'documents' is a list of lists, where each inner list contains the document text for a given query text
if query_results and 'documents' in query_results and query_results['documents'] and query_results['documents'][0]:
    # Since we queried with a single text, the first inner list holds its results;
    # take the top-ranked document from it.
    retrieved_text = query_results['documents'][0][0]

    print("Key Project Objectives:")
    # The entire document was returned as the most relevant result, so locate the
    # "Business Objectives" section in it by its header.
    objectives_section_start = retrieved_text.find("2. Business Objectives")
    if objectives_section_start != -1:
        # The section ends where the next one ('3. Scope') starts; if that header
        # is missing, take the rest of the document.
        scope_section_start = retrieved_text.find("3. Scope", objectives_section_start)
        if scope_section_start == -1:
            scope_section_start = len(retrieved_text)
        print(retrieved_text[objectives_section_start:scope_section_start].strip())
    else:
        print("Could not find the 'Business Objectives' section in the retrieved document.")

else:
    print("No documents found in the query results to extract objectives from.")

Key Project Objectives:
2. Business Objectives
Increase conversion rate (visitors → customers) by improving site usability.
Improve user retention by providing better account management and after-sales support.
Enhance product discovery (search, categories, filtering).
Ensure site is mobile-friendly / responsive.
Reduce cart abandonment.
Improve performance (site speed, page load).
Enhance credibility & trust (reviews, detailed product info, security).
Provide scalable and maintainable architecture.
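
Every query above returns the same full document because the whole BRD was stored under a single ID, so the section still has to be cut out by hand afterwards. One possible alternative is to split the text into sections before inserting it. The following is a minimal sketch, assuming section headers look like "2. Business Objectives" (digits, a dot, a space, then a capital letter); `split_sections`, the sample text, and the ID scheme are hypothetical and not part of the notebook above:

```python
import re


def split_sections(text):
    """Split text at lines that look like numbered headers, e.g. '2. Business Objectives'."""
    # Assumption: a header line starts with digits, a dot, a space, and a capital letter.
    starts = [m.start() for m in re.finditer(r"^\d+\. [A-Z]", text, flags=re.MULTILINE)]
    if not starts:
        return [text.strip()]
    bounds = starts + [len(text)]
    sections = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    # Keep any text before the first header as its own chunk.
    preamble = text[:starts[0]].strip()
    return ([preamble] if preamble else []) + sections


# Shortened sample text for illustration only.
sample = (
    "Business Requirements Document (BRD)\n"
    "1. Executive Summary\nDemoblaze is an online demo/product store.\n"
    "2. Business Objectives\nReduce cart abandonment.\n"
    "3. Scope\n"
)
chunks = split_sections(sample)
ids = [f"demoblaze_brd_section_{i}" for i in range(len(chunks))]
for chunk_id, chunk in zip(ids, chunks):
    print(chunk_id, "->", chunk.splitlines()[0])
```

The resulting lists could then be passed to `collection.add(documents=chunks, ids=ids)` in the same way as the single-document call, so that a query can return the matching section rather than the entire file.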
