# Introduction to Jupyter Notebook

Jupyter Notebook is an open-source web-based application that allows you to create and share documents containing live code, equations, visualizations, and narrative text. It provides an interactive computing environment that supports various programming languages, including Python, R, and Julia.

## Features of Jupyter Notebook

* **Interactive Environment:** Jupyter Notebook provides an interactive environment where you can write and execute code in individual cells. This allows you to experiment, test, and modify code snippets easily.

* **Mix of Code and Markdown:** With Jupyter Notebook, you can include narrative text, equations, and visualizations alongside your code. This makes it a powerful tool for data analysis, research, and storytelling.

* **Real-time Execution:** Each code cell in Jupyter Notebook can be executed independently, and the output is displayed immediately below the cell. This allows you to see the results of your code instantly and iterate on your analysis or calculations.

* **Rich Output:** Jupyter Notebook supports the display of rich media outputs, including plots, images, HTML, LaTeX equations, and interactive widgets. This enables you to create dynamic and visually appealing presentations.

* **Notebook Sharing:** Jupyter Notebook files (with the **.ipynb** extension) can be easily shared and published online. You can share your interactive notebooks with others, making it a collaborative tool for data analysis, teaching, and sharing reproducible research.

## Getting Started with Jupyter Notebook

To start using Jupyter Notebook, you need to install it on your local machine or use an online platform that provides Jupyter Notebook hosting. Here are the basic steps to get started:

* **Install Jupyter Notebook:** Install Jupyter Notebook by following the installation instructions provided in the official documentation. Jupyter Notebook can be installed using Python's package manager, **pip**, or through an Anaconda distribution.

* **Launch Jupyter Notebook:** Once installed, you can launch Jupyter Notebook by running the command `jupyter notebook` in your command-line interface. This starts a local notebook server and opens the Jupyter Notebook interface in your default web browser.

* **Create a New Notebook:** In the Jupyter Notebook interface, you can create a new notebook by clicking on the **"New"** button and selecting the desired kernel, which determines the programming language (e.g., Python 3). This will open a new notebook with an empty code cell.

* **Execute Code:** Start writing code in the code cell and execute it by pressing **Shift + Enter** or clicking the **"Run"** button. The output of the code will be displayed below the cell.

* **Add Markdown Cells:** To add narrative text, equations, or headings to your notebook, you can create Markdown cells. Markdown cells support plain text as well as Markdown syntax for formatting.

* **Save and Share:** Periodically save your notebook by clicking on the **"Save"** button. You can download your notebook as a **.ipynb** file or share it by publishing it on platforms like GitHub or Jupyter Notebook hosting services.
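
Sharing works because a notebook file is plain JSON text. As a minimal sketch, assuming the nbformat 4 layout and using made-up cell contents, the structure of a small **.ipynb** file can be built and read back with Python's standard `json` module. Real notebooks usually carry more metadata than shown here.

```python
import json

# Minimal notebook structure (assumed nbformat 4 layout, illustrative content only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# My first notebook"],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # None means the cell has not been run yet
            "metadata": {},
            "outputs": [],
            "source": ['print("Hello World")'],
        },
    ],
}

# This is the text that would be stored in a .ipynb file.
ipynb_text = json.dumps(notebook, indent=1)

# Reading the text back recovers the same cells.
restored = json.loads(ipynb_text)
cell_types = [cell["cell_type"] for cell in restored["cells"]]
print(cell_types)  # ['markdown', 'code']
```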

## Conclusion

Jupyter Notebook is a versatile tool that provides an interactive and collaborative environment for data analysis, research, teaching, and more. Its combination of code execution, narrative text, and visualizations makes it a popular choice among data scientists, researchers, and educators. With Jupyter Notebook, you can explore and present your data in a dynamic and engaging manner.

# INSTALLING ANACONDA

```python
print("Hello World")
```

# Statistics Revision
## Basics of Statistics

### What is Data

![image.png](attachment:image.png)

### QUALITATIVE DATA


A variable that cannot assume a numerical value but can be classified into two or more nonnumeric categories is called
a qualitative or categorical variable. The data collected on such a variable are called **qualitative** data.

There are two types of Qualitative variables:
1. Nominal Variables - The values are not ordered. Example: Nationality, Gender,
etc.
2. Ordinal Variables - The values are ordered or ranked. Example: Satisfaction
score (Not satisfied, Satisfied, Delighted), Spiciness of food (Less spicy, Mild,
Hot).
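
The difference between nominal and ordinal values can be sketched in Python. This is only an illustration: the sample responses are made up, and encoding the rank as the position in a list is one possible choice, not a standard method.

```python
from collections import Counter

# Nominal: categories have no order, so counting is the natural operation.
nationalities = ["Indian", "French", "Indian", "Kenyan"]  # made-up sample
nationality_counts = Counter(nationalities)

# Ordinal: categories have a rank, encoded here by position in this list.
satisfaction_levels = ["Not satisfied", "Satisfied", "Delighted"]
responses = ["Satisfied", "Delighted", "Not satisfied", "Satisfied"]  # made-up sample

# Sorting by rank is meaningful only for ordinal data.
sorted_responses = sorted(responses, key=satisfaction_levels.index)

print(nationality_counts.most_common(1))  # [('Indian', 2)]
print(sorted_responses)
```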

### QUANTITATIVE DATA

A variable that can be measured numerically is called a quantitative variable.
The data collected on a quantitative variable are called quantitative data.

There are two types of Quantitative variables:
1. Discrete Variables - A variable whose values are countable is called a discrete
variable. In other words, a discrete variable can assume only certain values
with no intermediate values. Example: Number of heads in 10 coin tosses.
2. Continuous Variables - A variable that can assume any numerical value over
a certain interval or intervals is called a continuous variable. Example: Height
of a person.
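
As a rough sketch, the two kinds of quantitative variables map onto counts and measurements in Python. The seed and the height values below are arbitrary, chosen only for illustration.

```python
import random

# Discrete: the number of heads in 10 tosses can only be 0, 1, ..., 10.
rng = random.Random(42)  # fixed seed so the run is repeatable
tosses = [rng.choice(["H", "T"]) for _ in range(10)]
heads = tosses.count("H")

# Continuous: a height can take any value within an interval.
heights_cm = [162.5, 170.25, 181.0]  # made-up measurements

print(heads)
print(heights_cm)
```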

## TYPES OF STATISTICS

![image.png](attachment:image.png)

### DESCRIPTIVE STATISTICS

Descriptive statistics consists of methods for organizing, displaying, and describing data by using tables, graphs, and
summary measures.

* Measures of Central Tendency
    * Mean
    * Median
    * Mode
* Measures of Dispersion
    * Range
    * Standard Deviation
* Frequency Distributions
* Histograms
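
The measures of central tendency and a simple frequency distribution can be sketched with Python's standard library. The `sample` list is a small made-up data set used only for illustration.

```python
import statistics
from collections import Counter

sample = [2, 3, 3, 5, 7, 10]  # made-up data

mean_value = statistics.mean(sample)      # sum of values / number of values
median_value = statistics.median(sample)  # middle value of the sorted data
mode_value = statistics.mode(sample)      # most frequent value

# Frequency distribution: how many times each value occurs.
frequencies = Counter(sample)

print(mean_value, median_value, mode_value)
print(sorted(frequencies.items()))
```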

#### Range
Range is the difference between the largest and the smallest values in a data set.

<h3><center>Range = Largest Value  - Smallest Value</center></h3>

For the ages of people attending a party below, what is the range?

| X (age) | | | | | | | | |
|-|-|-|-|-|-|-|-|-|
| 10 | 14 | 26 | 25 | 30 | 34 | 14 | 33 | 33 |
| 13 | 21 | 25 | 29 | 28 | 7 | 31 | 31 | 30 |
| 25 | 33 | 31 | 13 | 28 | 33 | | | |

Range = 34 – 7 = 27

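> **Note:** The range depends only on the two most extreme values, so a single outlier can change it drastically. On its own it may therefore not be a very useful measure of spread.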
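
The same calculation can be sketched in Python. The list name `ages` is our own and holds the 24 values from the table above; the extra guest aged 95 is a hypothetical value added only to illustrate the effect of an outlier.

```python
# Ages of the 24 people at the party, copied from the table.
ages = [
    10, 14, 26, 25, 30, 34, 14, 33, 33,
    13, 21, 25, 29, 28, 7, 31, 31, 30,
    25, 33, 31, 13, 28, 33,
]

# Range = largest value - smallest value
age_range = max(ages) - min(ages)
print(age_range)  # 27

# A single hypothetical outlier changes the range drastically.
ages_with_outlier = ages + [95]
range_with_outlier = max(ages_with_outlier) - min(ages_with_outlier)
print(range_with_outlier)  # 88
```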

### VARIANCE & STANDARD DEVIATION

**Variance** is the average of the squared differences from the Mean.

**Standard Deviation** is the square root of variance.

$$ \sigma^2 \: (Variance) = \frac{\sum (X - \mu)^2}{N} $$
$$ \sigma \: (Standard \: Deviation) = \sqrt{\sigma^2} $$

Here $\mu$ is the mean of the data and $N$ is the number of values.

From our party attendance data (mean = 24.875, N = 24):

$$ Variance = \frac{(10-24.875)^2 + (14-24.875)^2 + \dots + (33-24.875)^2}{24} = \frac{1624.625}{24} = 67.69 $$

$$ Standard \: Deviation = \sqrt{\sigma^2} = \sqrt{67.69} = 8.23 $$
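
The calculation above can be checked in Python. This is a minimal sketch using the standard library's `statistics` module; the list name `ages` is our own and holds the 24 values from the party table. `pvariance` and `pstdev` divide by N, matching the formula used here, whereas `statistics.variance` and `statistics.stdev` divide by N - 1 and would give slightly larger results.

```python
import statistics

# Ages of the 24 people at the party, copied from the table.
ages = [
    10, 14, 26, 25, 30, 34, 14, 33, 33,
    13, 21, 25, 29, 28, 7, 31, 31, 30,
    25, 33, 31, 13, 28, 33,
]

mean_age = statistics.mean(ages)
print(mean_age)  # 24.875

# Manual calculation: average of the squared differences from the mean.
squared_diffs = [(x - mean_age) ** 2 for x in ages]
manual_variance = sum(squared_diffs) / len(ages)

# Library calculation (population variance and standard deviation).
variance = statistics.pvariance(ages)
std_dev = statistics.pstdev(ages)

print(round(manual_variance, 2))  # 67.69
print(round(variance, 2))         # 67.69
print(round(std_dev, 2))          # 8.23
```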

### Important Python tools

* NumPy
* Pandas
* Seaborn / Matplotlib

#### NumPy