# My JupyterLite Notebook

## Introduction

As part of this course, I will use this Jupyter notebook to demonstrate the knowledge I have acquired.

## Programming Languages Used in Data Science

The most widely used languages in data science are the following:

- **Python:** One of the most popular languages due to its simplicity and its wide range of data science libraries, such as NumPy, Pandas, scikit-learn, TensorFlow, and Keras, among others.

- **SQL:** Structured query language used to manage and manipulate relational databases.

- **Java:** Although less common, it is used in some big data applications and frameworks such as Hadoop.

- **C/C++:** Mainly used where algorithm performance needs to be optimized, although they are less common in day-to-day data analysis.

- **R:** Widely used for statistical analysis and data visualization.
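As a small illustration of how two of these languages work together, the sketch below runs SQL statements from Python using the standard-library `sqlite3` module. The `scores` table and its rows are made up for this example.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student TEXT, grade REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ana", 9.0), ("Luis", 7.5), ("Marta", 8.5)],  # made-up sample rows
)

# Aggregate with plain SQL, then read the result back into Python.
average_grade = conn.execute("SELECT AVG(grade) FROM scores").fetchone()[0]

# Filter and sort with SQL; each row comes back as a tuple.
top_students = [
    row[0]
    for row in conn.execute(
        "SELECT student FROM scores WHERE grade >= 8 ORDER BY grade DESC"
    )
]
conn.close()

print(f"Average grade: {average_grade:.2f}")
print(f"Students with grade >= 8: {top_students}")
```

Here SQL does the filtering and aggregation, while Python handles the surrounding logic and output.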

## Libraries for Data Science

In data science, there are several specialized libraries that facilitate data analysis, manipulation, and visualization, as well as the construction of machine learning models.

### Data Manipulation and Analysis

- **Pandas (Python):** Essential library for the manipulation and analysis of tabular data. It provides data structures such as DataFrame and tools for cleaning, filtering, and transforming data.

- **NumPy (Python):** Library for working with multidimensional arrays and performing high-performance mathematical operations, such as linear algebra and basic statistics.

- **Dask (Python):** Extends the functionality of Pandas and NumPy for large data sets, allowing parallel processing.

- **dplyr (R):** Package for efficient data manipulation in R, with functions similar to Pandas, such as filtering, grouping, and aggregating data.

- **data.table (R):** High-speed package for data manipulation, especially useful in large volumes.
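The following minimal sketch shows the kind of filtering, grouping, and array math described above, using Pandas and NumPy only. It assumes both libraries are installed; the `city` and `sales` columns are made-up sample data.

```python
import numpy as np
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame(
    {
        "city": ["Lima", "Quito", "Lima", "Quito"],
        "sales": [100, 80, 120, 60],
    }
)

# Pandas: filter rows with a boolean mask, then group and aggregate.
large_sales = df[df["sales"] >= 80]
total_by_city = df.groupby("city")["sales"].sum()

# NumPy: vectorized math on the underlying array.
sales = df["sales"].to_numpy()
mean_sales = np.mean(sales)

print(len(large_sales))
print(total_by_city.to_dict())
print(mean_sales)
```

The dplyr and data.table packages offer comparable filter and group-by operations in R.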

### Data Visualization

- **Matplotlib (Python):** Standard library for creating basic charts such as line, bar, and scatter plots.

- **Seaborn (Python):** Extends Matplotlib to generate more aesthetic and detailed statistical graphs, facilitating the creation of heatmaps, box plots, etc.

- **Plotly (Python and R):** Tool for creating interactive 2D and 3D graphs, useful for web applications and dashboards.

- **ggplot2 (R):** One of the most powerful libraries for visualization in R, based on the grammar of graphics. It allows you to create complex and highly customizable graphs.

- **Bokeh (Python):** Tool for generating interactive and embeddable graphs in web applications.

### Machine Learning and AI

- **scikit-learn (Python):** One of the most popular libraries for machine learning in Python. It offers algorithms for classification, regression, clustering and dimensionality reduction.

- **TensorFlow (Python):** Machine learning and deep learning framework developed by Google. It is widely used to build neural networks and deep learning models.

- **Keras (Python):** High-level API designed to simplify building neural networks. It originally ran on top of backends such as TensorFlow or Theano; Keras 3 supports TensorFlow, JAX, and PyTorch.

- **PyTorch (Python):** Deep learning framework developed by Facebook (now Meta), popular in research for its flexibility.

- **XGBoost (Python, R):** Fast and efficient implementation of gradient boosting, an algorithm widely used in machine learning competitions and for structured data.

- **LightGBM (Python, R):** Alternative to XGBoost, designed for speed and low memory use when handling large datasets.

- **caret (R):** Package that simplifies the process of building and evaluating machine learning models in R, integrating various techniques and algorithms.

### Big Data

- **Apache Spark (Python, Scala, R, Java):** Framework for processing large volumes of data in parallel, ideal for machine learning and large-scale data analysis.

- **Hadoop (Java, Python):** Distributed system for storing and processing large data sets.

### Natural Language Processing (NLP)

- **spaCy (Python):** Natural language processing (NLP) library that allows tasks such as grammatical analysis, tokenization, entity recognition, etc.

- **NLTK (Python):** Provides basic tools for working with text, such as lexical and syntactic analysis.

- **Gensim (Python):** Tool for topic modeling and analysis of unstructured text using algorithms such as Word2Vec and LDA.

### Optimization and Evolutionary Algorithms

- **SciPy (Python):** Offers advanced functions for optimization, integration, interpolation, and differential equations, among other areas of applied mathematics.

- **DEAP (Python):** Library for evolutionary algorithms and optimization, such as genetic and differential evolution algorithms.

### Neural Networks and Deep Learning

- **Theano (Python):** Previously widely used for deep learning, it has largely been superseded by TensorFlow and PyTorch, though it is still used in certain scientific applications.

- **MXNet (Python, Scala, R):** Efficient deep learning framework, used by Amazon in its AWS SageMaker machine learning engine.

## Data Science Tools


| **Function**                    | **Tool**                   | **Description**                                                                            |
|----------------------------------|----------------------------|--------------------------------------------------------------------------------------------|
| **Data Manipulation**            | Pandas                     | Library for tabular data manipulation in Python.                                            |
|                                  | NumPy                      | Supports arrays and high-performance mathematical operations in Python.                     |
|                                  | dplyr                      | Data manipulation package in R.                                                            |
|                                  | data.table                 | High-speed data manipulation package in R.                                                 |
| **Data Visualization**           | Matplotlib                 | Library for simple charts in Python.                                                       |
|                                  | Seaborn                    | Extends Matplotlib for more advanced statistical plots in Python.                           |
|                                  | ggplot2                    | Powerful data visualization package in R.                                                  |
|                                  | Plotly                     | Interactive charting library for Python and R.                                              |
|                                  | Bokeh                      | Tool for creating interactive plots in web applications.                                   |
| **Machine Learning**             | scikit-learn               | Library for machine learning models in Python.                                              |
|                                  | TensorFlow                 | Framework for machine learning and deep learning, particularly for neural networks.         |
|                                  | Keras                      | High-level API for building neural networks, runs on TensorFlow.                            |
|                                  | PyTorch                    | Flexible deep learning framework developed by Facebook.                                     |
|                                  | XGBoost                    | Efficient implementation of gradient boosting.                                              |
|                                  | LightGBM                   | Gradient boosting library optimized for large datasets.                                     |
|                                  | caret                      | Simplifies the machine learning model building process in R.                                |
| **Big Data**                     | Apache Spark               | Framework for parallel processing of large-scale data.                                      |
|                                  | Hadoop                     | Distributed system for storing and processing big data.                                     |
| **Natural Language Processing (NLP)** | spaCy              | Library for natural language processing in Python.                                          |
|                                  | NLTK                       | Basic tools for working with text in Python.                                                |
|                                  | Gensim                     | Tool for topic modeling and text analysis in unstructured data.                             |
| **Optimization & Evolutionary Algorithms** | SciPy           | Advanced functions for optimization and applied mathematics.                               |
|                                  | DEAP                       | Library for evolutionary algorithms and optimization.                                       |
| **Neural Networks & Deep Learning** | Theano             | Framework for deep learning (less commonly used now).                                       |
|                                  | MXNet                      | Efficient deep learning framework used in AWS SageMaker.                                    |


## Arithmetic Operations

Arithmetic expressions are fundamental in programming for performing mathematical operations. Below are some examples of common arithmetic expressions:

### Basic Arithmetic Operations

- **Addition (`+`)**: Adds two numbers.
  - Example: `5 + 3` results in `8`.

- **Subtraction (`-`)**: Subtracts the second number from the first.
  - Example: `10 - 4` results in `6`.

- **Multiplication (`*`)**: Multiplies two numbers.
  - Example: `6 * 7` results in `42`.

- **Division (`/`)**: Divides the first number by the second. In Python, the result is always a float.
  - Example: `15 / 3` results in `5.0`.

- **Exponentiation (`**`)**: Raises the first number to the power of the second.
  - Example: `2 ** 3` results in `8`.
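The examples above can be checked directly in a Python cell. This is a minimal sketch; the variable names are chosen only for illustration.

```python
addition = 5 + 3          # 8
subtraction = 10 - 4      # 6
multiplication = 6 * 7    # 42
division = 15 / 3         # 5.0 (the / operator always returns a float)
exponentiation = 2 ** 3   # 8

print(addition, subtraction, multiplication, division, exponentiation)
print(type(division).__name__)  # float, even though 15 is divisible by 3
```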

### Modulus and Floor Division

- **Modulus (`%`)**: Returns the remainder of the division.
  - Example: `10 % 3` results in `1`.

- **Floor Division (`//`)**: Divides the first number by the second and rounds the result down to the nearest integer.
  - Example: `10 // 3` results in `3`.
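A short sketch of both operators follows, including the built-in `divmod()` function, which returns the quotient and remainder together. The negative-number case is added here because "rounding down" means rounding toward negative infinity, which can be surprising.

```python
remainder = 10 % 3            # 1
floor_quotient = 10 // 3      # 3
both = divmod(10, 3)          # (3, 1): quotient and remainder in one call

# With a negative operand, // still rounds down (toward negative infinity),
# and % returns a remainder with the same sign as the divisor.
negative_floor = -10 // 3     # -4, not -3
negative_remainder = -10 % 3  # 2

# The two operators are tied together by this identity:
identity_holds = (-10 // 3) * 3 + (-10 % 3) == -10

print(remainder, floor_quotient, both)
print(negative_floor, negative_remainder, identity_holds)
```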

### Combining Arithmetic Operations

Arithmetic expressions can combine multiple operations using parentheses `()` to define precedence.

- **Example with Parentheses**:
  - `(5 + 3) * 2` results in `16`.

In this case, the expression inside the parentheses is evaluated first, then multiplied by 2.
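To see why the parentheses matter, the sketch below compares the same numbers with and without them.

```python
with_parentheses = (5 + 3) * 2     # addition first: 8 * 2 = 16
without_parentheses = 5 + 3 * 2    # multiplication first: 5 + 6 = 11

print(with_parentheses)
print(without_parentheses)
```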

### Order of Operations (PEMDAS)

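Python evaluates expressions following **PEMDAS** (Parentheses, Exponents, Multiplication and Division, Addition and Subtraction). Multiplication and division share the same precedence and are evaluated from left to right, as are addition and subtraction.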

- **Example following PEMDAS**:
  - `3 + 5 * 2 ** 2 - 8 / 4` results in `21.0`.

Steps:
1. **Exponentiation**: `2 ** 2 = 4`
2. **Multiplication**: `5 * 4 = 20`
3. **Division**: `8 / 4 = 2.0`
4. **Addition and Subtraction**: `3 + 20 - 2.0 = 21.0`

Understanding the order of operations ensures accurate results when combining multiple arithmetic expressions.
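The steps listed above can be reproduced one at a time and compared with the full expression. This is a minimal sketch; the `step_*` names are illustrative only.

```python
step_1 = 2 ** 2                  # exponentiation: 4
step_2 = 5 * step_1              # multiplication: 20
step_3 = 8 / 4                   # division: 2.0 (always a float)
step_4 = 3 + step_2 - step_3     # addition and subtraction, left to right: 21.0

full_expression = 3 + 5 * 2 ** 2 - 8 / 4

print(step_4)
print(full_expression)
print(step_4 == full_expression)
```

The result is a float because the expression contains a `/` division, even though every operand is an integer.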

## Operations Exercises

### Addition

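```python
# input() always returns text, so convert each value to a number first;
# otherwise "+" would join the strings ("2" + "3" gives "23").
number_1 = float(input("Enter the first value: "))
number_2 = float(input("Enter the second value: "))

addition = number_1 + number_2
print(f"Result: {addition}")
```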
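Because `input()` returns text, values must be converted to numbers before they can be added arithmetically. The sketch below isolates that step in a hypothetical helper, `add_values` (not part of the exercise itself), so the behavior can be checked without typing anything in.

```python
def add_values(first: str, second: str) -> float:
    """Convert two text values to numbers and return their sum."""
    return float(first) + float(second)


# Adding strings joins them; adding converted numbers sums them.
concatenated = "2" + "3"
summed = add_values("2", "3")

# float() raises ValueError when the text is not a valid number.
try:
    add_values("two", "3")
except ValueError:
    error_message = "Please enter numeric values only."
else:
    error_message = ""

print(concatenated)
print(summed)
print(error_message)
```

In an interactive cell, the helper could be called as `add_values(input(), input())`.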