# Introduction to Jupyter Notebooks and programming in Python
*Converted from the NU Stat “Introduction to programming for data science” web chapter into a Jupyter notebook (Markdown + code cells), for use in **JupyterLab** or **VS Code**.*

---


## 1 Introduction to Jupyter Notebooks and programming in Python

This chapter is a very brief introduction to Python and Jupyter notebooks. We discuss only the content relevant to analyzing data with Python.


## 1.1 Installation

**Anaconda:** If you are new to Python, we recommend downloading the Anaconda installer and following its installation instructions. Once it is installed, we’ll use the Jupyter Notebook interface to write code.


## 1.2 Jupyter notebook


### 1.2.1 Introduction

Jupyter Notebook is an interactive platform where you can write code and text and make visualizations. You can launch it from the Anaconda Navigator or by opening the Jupyter Notebook application directly. It should open automatically in your default browser. The figure below shows Jupyter Notebook opened in Google Chrome. This page is called the landing page.

![Jupyter landing page](intro_jupyter_assets/jupyter.jpg)

To create a new notebook, click on the `New` button and select the `Python 3` option. You should see a blank notebook as in the figure below.

![New notebook](intro_jupyter_assets/jupyter_newbook.jpg)


### 1.2.2 Writing and executing code

**Code cell:** By default, a cell is of type *Code*, i.e., for typing code, as shown by the default selection in the cell-type dropdown menu. Try typing a line of Python code (say, `2+3`) in an empty code cell and execute it by pressing **Shift+Enter**. This executes the code and moves to the next cell, creating a new code cell if there is none below.

- **Ctrl+Enter** (Windows/Linux) or **Cmd+Enter** (Mac) executes the current cell **without** creating a new one.

**Commenting code in a code cell:** Write comments as you write the code, to explain its purpose or to briefly describe the tasks it performs. To add a comment in a code cell, start it with a `#` sign.


```python
# This code adds 3 and 5
3 + 5
```


Comments help other users understand your code. They also help you keep track of what each part of your own code does.

**Markdown cell:** Although a comment can be written in a code cell, a code cell cannot be used for writing headings/sub-headings, and is not appropriate for writing lengthy chunks of text. In such cases, change the cell type to *Markdown* from the dropdown menu.

Name the notebook by clicking on the title text, which initially reads ‘Untitled’.


### 1.2.3 Saving and loading notebooks

Save the notebook by clicking on `File` and selecting `Save as`, or by clicking on the `Save and Checkpoint` icon. Your notebook will be saved as a file with the extension `.ipynb`. This file contains all the code as well as the outputs, and can be opened and edited by any Jupyter user.

To load an existing Jupyter notebook, navigate to the folder of the notebook on the landing page, and then click on the file to open it.


### 1.2.4 Rendering notebook as HTML (without Quarto)

The original material mentions Quarto for rendering notebooks as HTML. In this course, we will **not** use Quarto. Instead, use one of the following:

#### Option A: Export from JupyterLab
- `File → Save and Export Notebook As… → HTML` (menu text may vary slightly by version)

#### Option B: Use `nbconvert` from a terminal
From the folder containing your notebook:
```bash
python -m nbconvert --to html your_notebook.ipynb
```

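Cell outputs are already included in the exported HTML. If you also want images referenced from Markdown cells to be embedded, so that the HTML file is self-contained, try: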
```bash
python -m nbconvert --to html --embed-images your_notebook.ipynb
```
*(Exact flags can vary with nbconvert versions.)*


## 1.3 In-class exercise

1. Create a new notebook.
2. Save the file as `In_class_exercise_1`.
3. Give a heading to the file - `First HTML file`.
4. Print `Today is day 1 of my programming course`.
5. Compute and print the number of seconds in a day.

The HTML file should look like the picture below.

![Example HTML render](intro_jupyter_assets/snapshot.jpg)
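
A calculation of the same shape as step 5 might look like the sketch below. It deliberately uses a different quantity (minutes in a week) so that the exercise itself is left to you. The variable names are illustrative only.

```python
# Number of minutes in a week: days x hours per day x minutes per hour
days_per_week = 7
hours_per_day = 24
minutes_per_hour = 60

minutes_per_week = days_per_week * hours_per_day * minutes_per_hour
print("Minutes in a week:", minutes_per_week)
```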


## 1.4 Python libraries

Python has several built-in functions, such as `print()`, `abs()`, `max()`, and `sum()`, which do not require importing any library. However, these functions are typically insufficient for analyzing data. Some of the popular libraries in data science and their primary purposes are as follows:

1. **NumPy:** Performing numerical operations and efficiently storing numerical data.
2. **Pandas:** Reading, cleaning and manipulating data.
3. **Matplotlib, Seaborn:** Visualizing data.
4. **SciPy:** Performing scientific computing such as solving differential equations, optimization, statistical tests, etc.
5. **Scikit-learn:** Data pre-processing and machine learning, with a focus on prediction.
6. **Statsmodels:** Developing statistical models with a focus on inference.
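
By contrast, the built-in functions mentioned before the list can be called directly, with no import. A minimal sketch, using made-up sample values:

```python
temperatures = [-3, 7, 12, -8, 5]  # made-up sample data

largest = max(temperatures)                 # largest value in the list
total = sum(temperatures)                   # sum of all values
coldest_magnitude = abs(min(temperatures))  # absolute value of the smallest value

print(largest, total, coldest_magnitude)
```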

A library can be imported using the `import` keyword. For example, the NumPy library can be imported as:


```python
import numpy as np
```


Using the `as` keyword, the NumPy library has been given the name `np`. All the functions and attributes of the library can be called using the `np.` prefix. For example, let us generate the sequence of whole numbers from `0` up to, but not including, `8` using the NumPy function `arange()`:


```python
np.arange(8)
```
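
Python's built-in `range()` uses the same convention as `np.arange()`: counting starts at 0 by default and the stop value is excluded. The sketch below uses `range()` so that it runs without NumPy installed; `np.arange()` accepts the same start, stop, and step arguments in the same order. The variable names are illustrative only.

```python
first_eight = list(range(8))         # start defaults to 0; the stop value 8 is excluded
from_two = list(range(2, 8))         # explicit start of 2
every_third = list(range(0, 8, 3))   # step of 3

print(first_eight)
print(from_two)
print(every_third)
```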


Generating random numbers is very useful for performing simulations. The `random` library, which is part of Python's standard library, generates random numbers such as integers and real numbers drawn from different probability distributions.

Below is an example of using the library's `randint()` function to generate a random integer in `[a, b]`, where `a` and `b` are integers and both endpoints are included.


```python
import random as rm
rm.randint(5, 10)  # This will generate a random integer in [5, 10]
```
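
Because the result changes on every run, simulations are often made reproducible by fixing the generator's seed with `seed()`. A minimal sketch, with an arbitrary seed value and variable names chosen only for illustration:

```python
import random as rm

rm.seed(42)  # arbitrary seed; fixing it makes the sequence below repeatable

# Ten random integers in [5, 10]; both endpoints can occur
draws = [rm.randint(5, 10) for _ in range(10)]

# A random real number in [0, 1)
u = rm.random()

print(draws)
print(u)
```

Re-running the cell with the same seed produces the same values; removing the `seed()` call gives different values on each run.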


## 1.5 Debugging and errors

Read sections **1.3 - 1.6** from:
http://openbookproject.net/thinkcs/python/english3e/way_of_the_program.html


## 1.6 Terms used in programming

Read section **1.11** from:
http://openbookproject.net/thinkcs/python/english3e/way_of_the_program.html


---
## Source

Converted from:
https://nustat.github.io/Intro_to_programming_for_data_sci/Introduction%20to%20Jupyter%20Notebooks%20and%20programming%20in%20python.html
