# Introduction to Data Science

## <span style="color:#2E86C1">1.1 What is Data Science?</span>

- **<span style="color:#D35400">Definition:</span>**  
  Data Science is a field that uses scientific methods, algorithms, and systems to extract knowledge and insights from data. In simpler words, it’s about making sense of large amounts of data to find useful information and make better decisions.

- **<span style="color:#D35400">Scope:</span>**  
  Data Science is a broad field that combines different areas:
  - <span style="color:#28B463"><b>Statistics</b></span>: Understanding how to collect, analyze, and interpret data.
  - <span style="color:#28B463"><b>Programming</b></span>: Using code (like Python) to process data.
  - <span style="color:#28B463"><b>Machine Learning</b></span>: Making machines (computers) learn from data to make predictions or decisions.
  - <span style="color:#28B463"><b>Data Visualization</b></span>: Presenting data in charts or graphs to make it easier to understand.
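
To make the four areas above concrete, here is a minimal sketch that touches each of them using only Python's standard library. The `daily_sales` numbers are made up for illustration, and the straight-line fit stands in for machine learning in its simplest form; real projects would typically use libraries such as pandas, scikit-learn, and Matplotlib instead.

```python
from statistics import mean, median

# Hypothetical data: units sold on seven consecutive days (made up for illustration).
daily_sales = [12, 15, 11, 18, 20, 17, 22]
days = list(range(1, len(daily_sales) + 1))

# Statistics: summarize the data.
average = mean(daily_sales)
middle = median(daily_sales)

# Machine learning (simplest form): fit a straight line with least squares
# and use it to predict the next day.
day_mean = mean(days)
slope = sum((d - day_mean) * (s - average) for d, s in zip(days, daily_sales)) / sum(
    (d - day_mean) ** 2 for d in days
)
intercept = average - slope * day_mean
predicted_day_8 = slope * 8 + intercept

# Data visualization: a plain-text bar chart, one '#' per unit sold.
chart_lines = [f"day {d}: {'#' * s} ({s})" for d, s in zip(days, daily_sales)]

# Programming: the code itself ties the steps together and reports the results.
print(f"mean={average:.2f}, median={middle}")
print(f"predicted sales for day 8: {predicted_day_8:.1f}")
print("\n".join(chart_lines))
```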

---

### <span style="color:#2E86C1">Why is Data Science Important?</span>

Data is everywhere today! Think about the huge amount of information generated on social media, in hospitals, and in online stores.

- **<span style="color:#D35400">Data Science helps:</span>**
  - Businesses understand customer behavior (e.g., what products are popular).
  - Healthcare professionals predict diseases or recommend treatments.
  - Technology companies build better products (like recommendation systems on Netflix or YouTube).

Because we have more data than ever before, the ability to make sense of it matters: companies rely on it to stay competitive, governments use it to shape better policies, and individuals can use it to make everyday decisions.

---


# Anaconda 

<span style="color:#2E86C1">**What is Anaconda?**</span>

Anaconda is a popular open-source distribution of Python and R, designed specifically for scientific computing and data science. It simplifies package management and deployment, making it easier to work with data science libraries and tools. Anaconda includes:

- <span style="color:#28B463">**Conda Package Manager**:</span> A tool for managing packages and environments, allowing users to easily install, update, and remove libraries.
- <span style="color:#28B463">**Integrated Development Environment (IDE)**:</span> Tools such as Spyder (an IDE) and Jupyter Notebook are included, providing environments to write and execute code interactively.
- <span style="color:#28B463">**Pre-installed Libraries**:</span> Anaconda comes with many popular libraries for data science, including NumPy, pandas, Matplotlib, and SciPy, which save time on installation.

<span style="color:#2E86C1">**Purpose of Anaconda**</span>

The purpose of Anaconda is to streamline the process of managing libraries and dependencies for data science projects. With Anaconda, users can:

- Create isolated environments for different projects to avoid version conflicts.
- Easily install and manage packages with a simple command.
- Utilize Jupyter Notebook for interactive coding and data visualization.
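
Because each environment has its own Python interpreter, it can help to check which one your code is actually running in. The following is a minimal sketch using only the standard library; the function name `describe_environment` is just an illustrative choice, and `CONDA_DEFAULT_ENV` is an environment variable that conda sets when an environment is activated (it will be empty if conda is not in use).

```python
import os
import sys


def describe_environment():
    """Return basic facts about the Python environment running this code."""
    return {
        # Path of the Python interpreter that is executing this script.
        "python_executable": sys.executable,
        # Root folder of the current environment (a conda env folder when one is active).
        "environment_root": sys.prefix,
        # Set by conda on activation; None when conda is not in use.
        "conda_env_name": os.environ.get("CONDA_DEFAULT_ENV"),
        "python_version": sys.version.split()[0],
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running this in two different environments should show two different interpreter paths, which is what keeps their packages from conflicting.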

<span style="color:#2E86C1">**How to Install Anaconda**</span>

Follow these steps to install Anaconda on your computer:

1. **Download the Installer**:
   - Go to the [Anaconda Distribution page](https://www.anaconda.com/products/distribution).
   - Choose the version compatible with your operating system (Windows, macOS, or Linux) and click the download button.

2. **Run the Installer**:
   - Locate the downloaded installer and run it.
   - Follow the prompts in the installation wizard:
     - Accept the license agreement.
     - Choose whether to install for "Just Me" or "All Users."
     - Select the installation location (the default is usually fine).

3. **Advanced Options** (Optional):
   - During the installation, you may see options to add Anaconda to your system PATH variable. It's generally recommended to leave this unchecked to avoid potential conflicts.

4. **Complete Installation**:
   - Click "Install" and wait for the installation to complete.
   - Once finished, you can launch Anaconda Navigator or use the Anaconda Prompt to start working with Python and R.

5. **Verify Installation**:
   - Open Anaconda Prompt (Windows) or Terminal (macOS/Linux) and type the following command to check if Anaconda is installed correctly:
     ```bash
     conda --version
     ```
   - If Anaconda is installed, this command will return the version number of Conda.
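
The same check can also be scripted from Python, which is handy if you want to confirm the setup inside a larger script. This is a rough sketch, assuming `conda` is reachable on your `PATH` (as it is inside Anaconda Prompt); the helper name `conda_version` is hypothetical.

```python
import shutil
import subprocess


def conda_version():
    """Return the output of `conda --version`, or None if conda is not found or fails."""
    conda_path = shutil.which("conda")  # search PATH for the conda executable
    if conda_path is None:
        return None
    try:
        result = subprocess.run(
            [conda_path, "--version"],
            capture_output=True,
            text=True,
            timeout=60,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    # Combine both streams in case the version is printed to stderr.
    output = (result.stdout + result.stderr).strip()
    return output or None


version = conda_version()
print(version if version else "conda was not found on PATH")
```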



## [Click here to Watch Installation video on YouTube](https://www.youtube.com/watch?v=oHHbsMfyNR4)

#### Click the Windows Start button at the bottom left, type `anaconda` (or `miniconda`), and open either the prompt or Navigator
-   <span style="color:#28B463">Anaconda Prompt:</span> command-line interface
-   <span style="color:#28B463">Anaconda Navigator:</span> GUI-based interface


<img src="../../images/anaconda_start_terminal.png" alt="Starting the Anaconda terminal" width="600"/>

### Once in the prompt, try the following:

```bash
conda info                                    # show conda installation status and version info
conda env list                                # list all conda environments
conda create --name env_name                  # create a new conda environment
conda create --name env_name python=version   # create an environment with a specific Python version
```

**<span style="color:#D35400">Note:</span>** If you are having trouble with a conda command, append `--help` to it:
```bash
conda COMMAND --help  # for example: conda create --help
```
You can add it after a partly typed conda command to see its usage and options.

<img src="../../images/conda_comms_1.png" alt="Conda commands, part 1" width="1000"/>

```bash
conda activate env_name                       # activate the environment
conda list                                    # list all packages installed in the active environment
conda install package_name                    # install a package into the active environment
conda install --name env_name package_name    # install a package into the named environment
```
**<span style="color:#D35400">Note:</span>** By default, when you open the Anaconda Prompt you are already in the `base` environment. From there you can create new environments and switch between them.

<img src="../../images/conda_comms_2.png" alt="Conda commands, part 2" width="600"/>
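
From inside Python you can get a view similar to `conda list`. The sketch below uses the standard-library `importlib.metadata` module; note that it only sees Python packages visible to the running interpreter, whereas `conda list` also reports non-Python packages. The helper name `installed_packages` is an illustrative choice.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of (name, version) pairs visible to this interpreter."""
    found = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.metadata.get("Version", "unknown")
        except Exception:  # broken or unreadable metadata: skip this entry
            continue
        if name:
            found.add((name, version))
    return sorted(found, key=lambda pair: (pair[0].lower(), pair[1]))


packages = installed_packages()
for name, version in packages[:10]:  # show the first ten only
    print(f"{name}=={version}")
print(f"{len(packages)} packages found")
```

Running it before and after `conda install package_name` in the same environment is a quick way to confirm that a Python package landed where you expected.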

```bash
conda uninstall package_name                  # uninstall the package from the current environment
conda install --name env_name package_name    # install a package into the environment specified after '--name'
```

<img src="../../images/conda_comms_3.png" alt="Conda commands, part 3" width="600"/>


```bash
conda search package_name  # list the versions of a package that are available to install
```
<img src="../../images/conda_comms_4.png" alt="Conda commands, part 4" width="600"/>

```bash
conda install package_name=version  # install a specific version of a package, if its requirements can be met
```

**<span style="color:#D35400">Note:</span>** Make sure all dependencies can be satisfied when you install a specific version of a package in a conda environment.

<img src="../../images/conda_comms_5.png" alt="Conda commands, part 5" width="600"/>

```bash
conda update package_name                             # update an installed package to the latest compatible version
conda install --channel channel_name package_name     # install a package from a specific channel
```

<img src="../../images/conda_comms_6.png" alt="Conda commands, part 6" width="600"/>

```bash
conda deactivate                              # leave the current environment (no name needed); returns to the previously active one, usually base
conda remove --name env_name package_name     # remove a specific package from that environment
conda remove --name env_name --all            # the '--all' flag deletes the whole environment (deactivate it first)
```
<img src="../../images/conda_comms_7.png" alt="Conda commands, part 7" width="600"/>

- For further assistance, see the [Anaconda documentation](https://docs.anaconda.com/).
- Anaconda command cheat sheet: [conda cheat sheet (PDF)](https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf)

# Jupyter Notebook

<span style="color:#2E86C1">**What is Jupyter Notebook?**</span>

Jupyter Notebook is an interactive web-based tool that allows you to write and run code in a flexible, user-friendly environment. It’s commonly used for data analysis, machine learning, and scientific computing. You can mix code, visualizations, and text in one document, making it easy to share and present your work.

**<span style="color:#D35400">Key Features:</span>**
- <span style="color:#28B463">Code Execution:</span> Run code snippets in languages like Python, R, and Julia.
- <span style="color:#28B463">Markdown Support:</span> Write formatted text, including headings, lists, and links.
- <span style="color:#28B463">Visualizations:</span> Create graphs and charts to visualize data directly in the notebook.
- <span style="color:#28B463">Interactive Widgets:</span> Add sliders and buttons to make your notebooks more interactive.

---

**<span style="color:#D35400">How to Run Jupyter Notebook:</span>**

1. Open the Anaconda Prompt.
2. Create a new environment or activate an existing one:
   - To create a new environment, use the command:
     ```bash
     conda create --name myenv python=version
     ```
     (Replace `myenv` with your preferred environment name and adjust the Python `version` as needed.)
   - To activate an environment (including one you have just created), use:
     ```bash
     conda activate myenv
     ```
3. Install Jupyter Notebook if it is not already installed in that environment:
   ```bash
   conda install jupyter
   ```
4. Run Jupyter Notebook:
   ```bash
   jupyter-notebook  # equivalent to: jupyter notebook
   ```

<img src="../../images/jupyter_comms_1.png" alt="jupyter_comms_1" width="800"/>

#### Once Jupyter Notebook opens in your browser, create a new notebook by clicking `New` and selecting a kernel:

<img src="../../images/jupyter_comms_2.png" alt="jupyter_comms_2" width="800"/>
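
As a first cell in a new notebook, you might check that the kernel can see the libraries you plan to use. This is a small sketch, not a required step: the library names are the ones listed as pre-installed in the Anaconda section, a freshly created environment may not have them yet, and `check_libraries` is a hypothetical helper name.

```python
from importlib.util import find_spec


def check_libraries(names):
    """Map each library name to True if it can be imported in this kernel."""
    return {name: find_spec(name) is not None for name in names}


# Libraries the Anaconda section lists as pre-installed; a fresh environment may lack them.
status = check_libraries(["numpy", "pandas", "matplotlib", "scipy"])
for name, available in status.items():
    hint = "available" if available else f"missing (try: conda install {name})"
    print(f"{name}: {hint}")
```

If a library shows as missing, install it from the Anaconda Prompt in the same environment and restart the kernel.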

---
<center><h2>Good, we're all set. Let's move on to the actual Data Science portion.</h2></center>
<center>========----========</center>