# 1. Getting Started 

## 1.1 Installing Python

Many PCs and Macs already have Python installed.

To check whether Python is installed on a Windows PC, search for Python in the Start menu or run the following in the Command Prompt (cmd.exe).

- C:\Users\Your Name>python --version

To check whether Python is installed on Linux or macOS, open a terminal and type :

- python --version

![image-5.png](attachment:image-5.png)

If you find that you do not have Python installed on your computer, then you can download it for free from the following website: https://www.python.org/

## 1.2 Quickstarting Python

Python is an interpreted programming language. This means that, as a developer, you write Python (.py) files in a text editor and then pass those files to the Python interpreter to be executed.

To run a Python file from the command line :

- C:\Users\Your Name>python helloworld.py

Here, "helloworld.py" is the name of your Python file.

Let's write our first Python file, called helloworld.py, which can be done in any text editor.

- helloworld.py

- print("Hello, World!")

Simple as that. Save your file. Open your command line, navigate to the directory where you saved your file, and run :

- C:\Users\Your Name>python helloworld.py

- The output should read: Hello, World!

**Congratulations, you have written and executed your first Python program.**

## 1.3 Python Command Line

To test a short piece of Python code, it is sometimes quickest and easiest not to write the code in a file. This is possible because Python can itself be run as an interactive command line.

Type the following on the Windows, Mac or Linux command line :

- C:\Users\Your Name>python

Or, if the "python" command did not work, you can try "py" :

- C:\Users\Your Name>py

From there you can write any Python code, including our hello world example from earlier in the tutorial :

- print("Hello, World!")

- Hello, World!

When you are done, type the following to quit the Python command line interface :

- exit()

## 1.4 Checking Version

In [1]:
import sys
print(sys.version)

3.8.8 (default, Apr 13 2021, 15:08:03) [MSC v.1916 64 bit (AMD64)]
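
When a script depends on a minimum Python version, the same `sys` module can be used to check it programmatically. A minimal sketch, in which the helper name `is_supported` and the (3, 8) threshold are illustrative assumptions:

```python
import sys


def is_supported(minimum=(3, 8)):
    """Return True if the running interpreter is at least `minimum`."""
    # sys.version_info is a tuple-like object: (major, minor, micro, ...)
    return sys.version_info[:2] >= tuple(minimum)


# Print the short version number, e.g. "3.8" for the interpreter shown above
print(f"{sys.version_info.major}.{sys.version_info.minor}")
print(is_supported())
```

Comparing tuples avoids parsing the free-form `sys.version` string.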


## 1.5 The Zen of Python

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


# 2. Anaconda (Python Distribution)

## 2.1 Introduction

**Anaconda is a distribution of the Python and R programming languages for scientific computing (data science, machine learning applications, large-scale data processing, predictive analytics, etc.), that aims to simplify package management and deployment.** 

The distribution includes data-science packages suitable for Windows, Linux, and macOS. It is developed and maintained by Anaconda, Inc., which was founded by Peter Wang and Travis Oliphant in 2012.

It comes with over 250 packages automatically installed, and over 7,500 additional open-source packages can be installed from PyPI or with conda, its package and virtual environment manager. It also includes a GUI, Anaconda Navigator, as a graphical alternative to the command-line interface (CLI).

## 2.2 Download and Install Anaconda Individual Edition

https://www.anaconda.com/products/individual

## 2.3 Verifying your Installation

You can confirm that Anaconda is installed and working with Anaconda Navigator or conda.

## 2.4 Anaconda Navigator

Anaconda Navigator is a graphical user interface that is automatically installed with Anaconda. Navigator will open if the installation was successful.

![image.png](attachment:image.png)

## 2.5 Anaconda Prompt

If you prefer a command line interface (CLI), you can use conda to verify the installation from Anaconda Prompt on Windows or from a terminal on Linux and macOS.

![image-2.png](attachment:image-2.png)
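
The conda check can also be scripted. The sketch below looks for a `conda` executable on the PATH and, if one is found, asks it for its version with `conda --version`. The helper name `conda_version` is an assumption for illustration, and on Windows the lookup may only succeed when run from Anaconda Prompt.

```python
import shutil
import subprocess


def conda_version():
    """Return the output of `conda --version`, or None if conda is not found."""
    conda_path = shutil.which("conda")  # search the PATH for the executable
    if conda_path is None:
        return None
    result = subprocess.run(
        [conda_path, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Read stdout first and fall back to stderr in case the version is written there
    return (result.stdout or result.stderr).strip() or None


print(conda_version() or "conda was not found on the PATH")
```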

# 3. Jupyter Notebook 

## 3.1 Introduction

Jupyter Notebook is an open-source, web-based interactive environment, which allows you to create and share documents that contain live code, mathematical equations, graphics, maps, plots, visualizations, and narrative text. 

It supports many programming languages, such as Python, PHP, R, and C#.

## 3.2 Advantages of Jupyter Notebook

Jupyter Notebook has the following advantages :

- **All in One Place :** Jupyter Notebook combines code, text, images, videos, mathematical equations, plots, maps, graphical user interfaces, and widgets in a single document.

- **Easy to Convert :** Jupyter Notebook allows users to convert notebooks into other formats such as HTML and PDF. Online tools such as nbviewer can also render a publicly available notebook directly in the browser.

- **Easy to Share :** Jupyter Notebooks are saved as structured text files (JSON format), which makes them easy to share.

- **Language Independent :** Jupyter Notebook is language-independent because notebooks are stored in JSON (JavaScript Object Notation), a language-independent, text-based file format. As a result, a notebook can be processed by any programming language and converted to other file formats such as Markdown, HTML, and PDF.

- **Interactive Code :** Jupyter Notebook uses the ipywidgets package, which provides many common user interface controls for exploring code and data interactively.
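
Because a notebook is plain JSON, it can be inspected with nothing more than the standard library. The sketch below builds a tiny notebook document in memory (a simplified, hand-written illustration of the notebook layout, not a complete file written by Jupyter), serializes it, and counts its cells by type.

```python
import json
from collections import Counter

# A simplified, hand-written notebook document for illustration only
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ['print("Hello, World!")'],
        },
    ],
}

# This string is the kind of text stored in an .ipynb file
text = json.dumps(notebook, indent=1)

# Any JSON parser can read it back
loaded = json.loads(text)
cell_counts = Counter(cell["cell_type"] for cell in loaded["cells"])
print(dict(cell_counts))
```

The same approach works in any language with a JSON parser, which is what makes the format easy to share and process.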

## 3.3 Disadvantages of Jupyter Notebook

Jupyter Notebook has the following disadvantages :

- It is very hard to test long asynchronous tasks.
- It offers less security.
- Cells can be run out of order.
- There is no IDE integration, no linting, and no code-style correction.

## 3.4 Launch Jupyter Notebook

![image.png](attachment:image.png)

## 3.5 Dashboard

The screenshot below shows the Jupyter Notebook dashboard, which contains three tabs.

![image.png](attachment:image.png)

## 3.6 Files Tab

The Files tab displays the files and folders in the current directory. It also provides an Upload button for uploading a file to the notebook server.

![image.png](attachment:image.png)

## 3.7 Running Tab

The Running tab is used to show currently running notebooks.

![image.png](attachment:image.png)

## 3.8 Cluster Tab

The Cluster tab is provided by IPython parallel, IPython's parallel computing framework, which extends the IPython kernel.

![image.png](attachment:image.png)

## 3.9 User Interface

When you create a new notebook, it opens with the notebook name, menu bar, toolbar, and an empty code cell.

![image.png](attachment:image.png)

**Notebook Name :** The notebook name is displayed at the top of the page, next to the Jupyter logo.

**Menu Bar :** The menu bar presents different options that are used to manipulate the notebook functions.

**Toolbar :** The toolbar provides a quick way for performing the most-used operations within the notebook.

**Code Cell :** A code cell allows you to write and edit code.

## 3.10 Components of Jupyter Notebook

Jupyter Notebook has the following three components.

**The Notebook Web Application :** It is an interactive web application for writing and running the code.

The notebook web application allows users to :

- Edit code in the browser with automatic syntax highlighting and indentation.

- Run code from the browser.

- See the results of computations in rich media representations, such as HTML, LaTeX, PNG, and PDF.

- Create and use JavaScript widgets.

- Include mathematical equations using Markdown cells.

**Kernels :** Kernels are separate processes, started by the notebook web application, that run the user's code in a given language and return the output to the notebook web application.

Kernels are available for the following languages :

- Python

- Julia

- Ruby

- R

- Scala

- node.js

- Go

**Notebook Documents :** A notebook document contains a representation of all the content visible in the notebook web application, including the inputs and outputs of computations, text, mathematical equations, graphs, and images.

## 3.11 Creating a Notebook

To create a notebook in Jupyter, go to New and select Python 3.

![image.png](attachment:image.png)

Now, you can see that a new notebook opens in a new tab.

![image-2.png](attachment:image-2.png)

## 3.12 Renaming the Notebook

To rename the notebook, double-click Untitled at the top of the screen. A pop-up window opens for renaming the file. Enter the new notebook name, then click Rename.

![image.png](attachment:image.png)

## 3.13 How to Write and Run a Program

After renaming the file, click the first cell in the notebook to enter edit mode. You can now write code in the working area. After writing the code, run it by pressing Shift+Enter or by clicking the Run button at the top of the screen.

![image.png](attachment:image.png)

## 3.14 Types of Cells

Jupyter Notebook uses the following four types of cells.

![image.png](attachment:image.png)

### 3.14.1 Code Cell

The contents of a code cell are treated as statements in the programming language of the current kernel. By default, the Jupyter Notebook kernel is Python, so you can write Python statements in a code cell. When you run a statement, its output is displayed below the code. Output can be presented as text, images, matplotlib plots, or HTML tables.

![image-2.png](attachment:image-2.png)

### 3.14.2 Markdown Cell

A Markdown cell provides documentation for the notebook and makes the notebook more attractive. It supports formatting features such as bold and italic text, headers, ordered and unordered (bullet) lists, hyperlinks, tables, and images.

To use the formatting features below, first select Markdown from the cell type drop-down menu.

![image-3.png](attachment:image-3.png)

**Bold and Italics :**

To make text bold, write the text between double underscores or double asterisks.

![image-4.png](attachment:image-4.png)

The following screenshot shows the output of the above code.

![image-5.png](attachment:image-5.png)

To make text italic, write the text between single underscores or single asterisks.

![image-6.png](attachment:image-6.png)

The following screenshot shows the output of the above code.

![image-7.png](attachment:image-7.png)

**Headers :**

Creating headers in Markdown is similar to creating headers in HTML. Markdown displays header text in six sizes. To make a line a header, start it with the # symbol. The number of # symbols determines the header level.

For example, header 1 uses one # symbol, header 2 uses two # symbols, and so on.

![image.png](attachment:image.png)

The following screenshot shows the output of the above Header cells.

![image-2.png](attachment:image-2.png)
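
The header rule can also be written as a small Python sketch that builds the Markdown source for a header of a given level. The helper name `markdown_header` is hypothetical.

```python
def markdown_header(text, level=1):
    """Return Markdown source for a header of the given level (1 to 6)."""
    if not 1 <= level <= 6:
        raise ValueError("Markdown supports header levels 1 to 6")
    # One '#' per level, then a space, then the header text
    return "#" * level + " " + text


for level in (1, 2, 3):
    print(markdown_header(f"Header {level}", level))
```

Pasting the printed lines into a Markdown cell and running it renders them as headers of decreasing size.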

**Ordered Lists :**

An ordered list starts with 1. Use Tab to indent an item and create a sub-list under the preceding item.

![image.png](attachment:image.png)

The following screenshot shows the output of the above Markdown data.

![image-2.png](attachment:image-2.png)

**Bullet Lists :**

In Jupyter Notebook, a Markdown cell converts a leading dash (-) into a solid circle and a leading asterisk (*) into a solid square.

![image.png](attachment:image.png)

The following screenshot shows the output of the above Markdown data.

![image-2.png](attachment:image-2.png)

**Hyperlinks :**

A Markdown cell allows you to insert hyperlinks. To insert a hyperlink, place the link text in square brackets [] followed by the URL in parentheses ().

You can use the following code to insert a hyperlink.

![image.png](attachment:image.png)

Output:

![image-2.png](attachment:image-2.png)

**Table Content :**

A Markdown cell allows you to create a table using the pipe symbol (|) and the dash symbol (-). Pipes separate the columns, and a row of dashes separates the header row from the table body.

The table creation is shown below:

![image.png](attachment:image.png)

The following screenshot shows the table content of markdown cell.

![image-2.png](attachment:image-2.png)
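
The pipe-and-dash layout can be sketched in Python by generating the Markdown source for a table from a list of rows. The helper name `markdown_table` and the sample rows are assumptions for illustration.

```python
def markdown_table(headers, rows):
    """Build Markdown table source from a header list and a list of rows."""
    lines = ["| " + " | ".join(headers) + " |"]
    # The row of dashes separates the header row from the body
    lines.append("| " + " | ".join("---" for _ in headers) + " |")
    for row in rows:
        lines.append("| " + " | ".join(str(value) for value in row) + " |")
    return "\n".join(lines)


print(markdown_table(["Kernel", "Language"], [["IPython", "Python"], ["IRkernel", "R"]]))
```

The printed text can be pasted into a Markdown cell, where it renders as a two-column table.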

**Images :**

To insert an image in a Markdown cell, first place the image in the same directory as the notebook. To do this, go to the Jupyter dashboard, select Upload, choose the image file, and click Open.

![image.png](attachment:image.png)

Once the image appears in the dashboard, click Upload. The image is now uploaded and listed in the dashboard.

![image-2.png](attachment:image-2.png)

Now, go to your current Notebook, and type the following code to insert the image.

![image-3.png](attachment:image-3.png)

The following screenshot shows that the image is inserted on the Notebook.

![image-4.png](attachment:image-4.png)

### 3.14.3 Raw NBConvert Cell

Raw NBConvert Cell provides a place where you can write output directly. These cells are not evaluated by the notebook kernel.

![image.png](attachment:image.png)

### 3.14.4 Heading Cell

The Jupyter Notebook does not support the Heading cell. When you select Heading from the drop-down menu, a pop-up opens on the screen, as shown in the screenshot below.

![image.png](attachment:image.png)

## 3.15 Enable Line Numbers for Jupyter Notebook Cells

![image.png](attachment:image.png)

## 3.16 Change your Jupyter Start-Up Folder

- Open cmd (or Anaconda Prompt) and run jupyter notebook --generate-config.

- This writes a file to C:\Users\username\.jupyter\jupyter_notebook_config.py.

- Browse to the file location and open it in an editor.

- Search for the following line in the file: #c.NotebookApp.notebook_dir = ''.

- Replace it with c.NotebookApp.notebook_dir = '/the/path/to/home/folder/'.

- Use forward slashes in the path, and write the home directory in full (for example /home/user/) instead of ~/. On Windows, backslashes also work, even when a folder name contains spaces, if the path is written as a raw string such as r"D:\yourUserName\Any Folder\More Folders". The string must not end with a backslash, because that would escape the closing quote.

- Remove the # at the beginning of the line so that the setting takes effect.
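
One way to avoid quoting mistakes in the path is to convert a Windows path to forward slashes before pasting it into the config file. A minimal sketch using `pathlib`; the folder shown is the placeholder path from the steps above, not a real location.

```python
from pathlib import PureWindowsPath

# Raw string: backslashes are kept literally (it must not end with a backslash)
windows_path = r"D:\yourUserName\Any Folder\More Folders"

# as_posix() returns the same path written with forward slashes
notebook_dir = PureWindowsPath(windows_path).as_posix()

# Line to place in jupyter_notebook_config.py
print(f"c.NotebookApp.notebook_dir = {notebook_dir!r}")
```

`PureWindowsPath` only manipulates the text of the path, so the sketch runs on any operating system and does not require the folder to exist.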

## 3.17 Should Jupyter Notebooks be used in Production?

Jupyter Notebooks have been around for quite some time now. They’re used a lot in machine learning, mainly for experimentation and visualization. They were meant for prototyping and exploration, not for production. 

But over the years, the ecosystem has grown. There is now a wider set of tools: JupyterLab, plugins, new kernels, and many others.

These changes came over the years thanks to:

- Experiments on the Cloud – Many people started preferring cloud for large computations and bigger datasets.
- Developer Workflow – Many machine learning teams started adopting software engineering practices like version control, git-flow, containerization, and more.
- Analysis to Production – If analysis code follows best practices, it can be reused easily in production.

**Pros and Cons of using Notebooks in Production :**

Pros :

- You can build standalone applications and dashboards with Voilà and serve them to end users.

- Jupyter notebooks can be scheduled as jobs in the cloud.

- You can create templated notebooks and execute them with papermill.
 
Cons :

- No proper code versioning.

- Reproducibility can be an issue because of state-dependent execution.

- Unit testing is difficult.

- Dependency management is poorly supported.

- Caching is an issue.

- No CI/CD.

These cons aren’t a huge limitation, as there are many ways to deal with them.

**Problems with Notebooks in Production :**

Jupyter is Markdown-savvy, it serializes images as base64, and it exposes functionality such as code execution, all through a web interface.

But it comes with its own problems:

- Version control and file size

- Modularity and code reuse

- Hidden state

- Testing/debugging

**Version Control and File Size :**

Jupyter notebooks (files with the .ipynb extension) that contain Python code aren’t Python files. They’re basically large JSON objects, so they’re not very suitable for a Git-like workflow. If we commit a notebook after making changes, the diff becomes really big, which makes it difficult to review or to merge into the main branch. This makes notebooks challenging to use in teams.

If the notebooks include images and a lot of plots, then the file size increases considerably. 

Solutions :

- nbdime – a tool for diffing and merging notebooks in a content-aware way.

- nbstripout – it strips the output from notebooks, which makes them easier to parse and compare.
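
The idea behind output stripping can be illustrated with plain Python. The sketch below is not nbstripout's implementation; it only shows, on a simplified hand-written notebook dictionary, how removing outputs and execution counts leaves just the source to be compared. The function name `strip_outputs` is an assumption.

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs removed."""
    stripped = copy.deepcopy(notebook)  # leave the original untouched
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# Simplified notebook with one executed code cell (illustrative data)
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["42\n"]}],
            "source": ["print(6 * 7)"],
        }
    ],
}

clean = strip_outputs(notebook)
# The stripped version serializes to fewer characters than the original
print(len(json.dumps(notebook)), ">", len(json.dumps(clean)))
```

With plots or images embedded as base64, the size difference is far larger than in this small example.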

**Modularity and Code Reuse :**

Modularity is one of the most crucial concepts in creating robust applications.

With notebooks, however, we put most of our code into cells. The usual way to reuse code in Python is through functions and classes. Notebooks also don’t allow proper packaging.

Solutions :

- Apply the Don’t Repeat Yourself (DRY) principle: generalize and consolidate your code as much as possible.
- Functions, for example, should have only one job, abstracting your logic without over-engineering. At the same time, take care not to create too many modules.
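
As an illustration of these points, logic that would otherwise be repeated across cells can be moved into one small function with a single job. The function name `normalize` and the sample data are hypothetical.

```python
def normalize(values):
    """Scale a list of numbers to the range 0..1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Avoid division by zero when all values are equal
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# The same function is reused instead of copying the formula into each cell
heights = normalize([150, 160, 170])
weights = normalize([50, 75, 100])
print(heights)
print(weights)
```

Once the function is stable, it can be moved out of the notebook into a regular .py module and imported.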

**Hidden State :**

Jupyter notebooks are an interface for writing and experimenting with code. But Jupyter has a weak spot. What you see is not always what you get.

Many people say that notebooks are good for reproducibility. And it’s true when you’re running the code in a linear order from start to finish. But we can also run cells in a non-linear order.

Jupyter runs code in the order that you execute it. And it remembers assignments whether or not they’re still there. The image below illustrates this.

The smaller box on the left represents the hidden state. This is code that you have already executed. In the next frame, we’ve deleted the variable, but it remains loaded in memory. This can lead to really weird situations.

![image.png](attachment:image.png)

Solutions :

- If your code is behaving strangely, a good first step is to restart the kernel.

- Write modular code that runs in linear order; this suits production.
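
The hidden state can be reproduced outside Jupyter. In the sketch below, a dictionary stands in for the kernel's memory and each `exec` call stands in for running one cell; this is a simplified model for illustration, not how the kernel is implemented.

```python
# The dictionary plays the role of the kernel's memory
kernel_memory = {}

# "Cell 1" is run once and then deleted from the notebook
exec("x = 10", kernel_memory)

# "Cell 2" still works, because x is still held in memory
exec("y = x * 2", kernel_memory)
first_result = kernel_memory["y"]
print(first_result)

# Restarting the kernel clears the memory, so the same cell now fails
kernel_memory = {}
try:
    exec("y = x * 2", kernel_memory)
except NameError as error:
    print("After restart:", error)
```

The second run fails with a `NameError`, which is what a reader would see when running the notebook top to bottom after the first cell was deleted.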

**Testing/Debugging :**

Notebooks are hard to debug and test even if they are linear, for two reasons. First, when you’re working on a project and the notebook grows large enough, there are too many things to keep track of (variables, functions, etc.), and it gets difficult to figure out the execution flow.

The second reason is that it’s difficult to unit test because we can’t directly import functions that are defined in the notebook into a testing module. There are ways to do it, but it’s not straightforward.

Solutions :

- testbook – a unit testing framework that will help us test code inside the notebook.
- nbval, pytest-notebook – nbval is a great library for reproducible notebooks. It compares the outputs stored in a notebook with the outputs generated when the notebook is executed again.
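
One of the less straightforward routes mentioned above is to pull the source of the code cells out of the notebook JSON, run it in a fresh namespace, and test the resulting functions. The sketch below does this on a simplified, hand-written notebook dictionary; it is not how testbook or nbval work, and the names `load_definitions` and `add` are hypothetical.

```python
def load_definitions(notebook):
    """Run every code cell top to bottom and return the resulting namespace."""
    namespace = {}
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            # In the notebook format, source is a list of lines (or a string)
            source = cell["source"]
            if isinstance(source, list):
                source = "".join(source)
            exec(source, namespace)
    return namespace


# Simplified notebook containing one function definition (illustrative data)
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Helpers"]},
        {"cell_type": "code", "source": ["def add(a, b):\n", "    return a + b\n"]},
    ]
}

definitions = load_definitions(notebook)

# A unit test can now call the function defined inside the notebook
assert definitions["add"](2, 3) == 5
print("add(2, 3) =", definitions["add"](2, 3))
```

Running every cell has side effects, so this approach suits notebooks that mainly define functions; dedicated tools handle the general case more safely.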

In the case of production notebooks, we want to encourage best practices and we want to avoid a lot of pitfalls and anti-patterns that have been mentioned in this section.

**Embracing Notebooks in Production :**

Notebooks are good when you’re just playing around and experimenting. But, as soon as you need to share your code or deploy a machine learning system into production, notebooks become quite challenging to work with.

We want a production notebook that’s testable in some form, deployable in some way, and extensible. 

Also, these notebooks are executed linearly. When we run a notebook in an automated way, we execute it top to bottom, one time.

**Future of Notebooks in Production :**

- Notebooks as Voilà applications: Notebooks are increasingly becoming applications. The notebook is no longer just the way you get to whatever product you’re building; it may be the product itself. As notebooks grow from dev environments into shareable applications, they become an end product themselves.

- Data Science Platforms: There are many data science platforms, like Anaconda. They make notebooks a priority in their toolkit and help simplify deployment.

- The rise of containers: Containers continue to expand their place in the data science ecosystem, and so notebooks are becoming a more practical tool for production deployments, even for serverless architectures like Lambda.

- New Jupyter capabilities: JupyterLab is really blurring the lines between production apps and many development tools even further – for example, by substituting extensions for traditional modules and packages.

**Final Thoughts :**

The use of notebooks in production is always a debatable topic. Many people take it as an undeniable truth that Jupyter notebooks are just for experimenting and prototyping, but that may not be completely true.

Notebooks are great tools for working with data, especially when leveraging open-source tools like papermill, Airflow, or nbdev. Jupyter allows us to reliably execute notebooks in the production system.

## 3.18 Six Easy Ways to Run Jupyter Notebook in the Cloud

The following can be used to run Jupyter Notebook in the cloud :

- Binder
- Kaggle Kernels
- Google Colaboratory (Colab)
- Microsoft Azure Notebooks
- CoCalc
- Datalore

They don't require you to install anything on your local machine. They are completely free (or they have a free plan). They give you access to the Jupyter Notebook environment (or a Jupyter-like environment). 

They allow you to import and export notebooks using the standard .ipynb file format. They support the Python language (and most support other languages as well).