Jupyter

Introduction: Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract insights and knowledge from structured and unstructured data. It combines expertise from statistics, computer science, and domain-specific knowledge.

Data science languages:

- Python
- R
- SQL
- Scala
- Julia

Libraries:

- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn
- TensorFlow
- PyTorch
- ggplot2
- dplyr
- tidyr
- caret

### Data Science Tools

| Category              | Tool                  | Description                                           |
|-----------------------|-----------------------|-------------------------------------------------------|
| **Programming Languages** | Python                | Versatile language with extensive data science libraries |
|                        | R                     | Specialized for statistical analysis and visualization |
|                        | SQL                   | Essential for database querying and management        |
|                        | Julia                 | Known for speed in numerical and scientific computing  |
|                        | Scala                 | Often used with Apache Spark for distributed computing  |
| **Data Manipulation and Analysis** | Pandas            | Python library for data manipulation and analysis      |
|                        | NumPy                 | Numerical computing library for arrays and matrices    |
|                        | dplyr (R)             | R package for data manipulation and transformation    |
|                        | DataFrames.jl (Julia)| Data manipulation in Julia                             |
| **Visualization**     | Matplotlib (Python)   | Plotting and visualization library for Python           |
|                        | ggplot2 (R)           | Data visualization package for R                        |
|                        | Seaborn (Python)      | Statistical data visualization based on Matplotlib     |
| **Machine Learning**   | Scikit-learn (Python) | Machine learning library for classification, regression, clustering, etc. |
|                        | caret (R)             | R package for classification and regression training    |
|                        | TensorFlow (Python)   | Open-source machine learning framework by Google        |
|                        | PyTorch (Python)      | Deep learning library with dynamic computational graphs |
|                        | MLJ (Julia)           | Machine learning framework for Julia                    |
| **Big Data Processing** | Apache Spark (Scala)  | Fast and general-purpose cluster-computing system for big data processing |
| **Database Interaction** | SQLAlchemy (Python)  | SQL toolkit and Object-Relational Mapping (ORM) for Python |
|                        | pandasql (Python)     | Allows running SQL queries on Pandas DataFrames          |
| **Notebook Environments** | Jupyter Notebook     | Interactive notebooks supporting code, text, and visualizations |
|                        | RStudio               | Integrated development environment for R                |
|                        | Google Colab          | Cloud-based Jupyter notebooks by Google                 |
| **Version Control**    | Git                   | Distributed version control system                      |
|                        | GitHub                | Web-based platform for version control and collaboration |
|                        | GitLab                | Web-based platform for Git repository management        |

These tools cover a wide range of functionalities and are commonly used in various stages of the data science workflow.



# Arithmetic Expressions Examples

Arithmetic expressions involve mathematical operations and are fundamental in programming and data science for numerical computations. Let's explore some examples using common arithmetic operators:

```python
# Addition
result_addition = 5 + 3
print(result_addition)  # Output: 8

# Subtraction
result_subtraction = 10 - 4
print(result_subtraction)  # Output: 6

# Multiplication
result_multiplication = 7 * 2
print(result_multiplication)  # Output: 14

# Division
result_division = 15 / 3
print(result_division)  # Output: 5.0 (Note: In Python 3, division always returns a float)

# Floor Division
result_floor_division = 15 // 3
print(result_floor_division)  # Output: 5 (Floor division rounds the quotient down to the nearest integer)

# Modulo (Remainder)
result_modulo = 17 % 5
print(result_modulo)  # Output: 2 (Remainder when 17 is divided by 5)

# Exponentiation
result_exponentiation = 2 ** 3
print(result_exponentiation)  # Output: 8 (2 raised to the power of 3)
```

The same operators work on values stored in variables. Multiplying and adding two numbers:

```python
# Define two numbers
number1 = 10
number2 = 5

# Multiply the numbers
result_multiplication = number1 * number2
print(f"Multiplication result: {result_multiplication}")  # Output: Multiplication result: 50

# Add the numbers
result_addition = number1 + number2
print(f"Addition result: {result_addition}")  # Output: Addition result: 15
```

Converting minutes to hours:

```python
# Define the number of minutes
minutes = 120

# Convert minutes to hours
hours = minutes / 60

# Print the result
print(f"{minutes} minutes is equal to {hours} hours.")  # Output: 120 minutes is equal to 2.0 hours.
```
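
When a duration is not a whole number of hours, the floor division and modulo operators from the earlier examples can split it into whole hours and leftover minutes. The sketch below uses Python's built-in `divmod`, which returns both results at once. The value `135` and the variable names are illustrative assumptions, not part of the original notebook.

```python
# Illustrative value (an assumption, not from the original notebook)
total_minutes = 135

# divmod(a, b) returns the pair (a // b, a % b)
whole_hours, remaining_minutes = divmod(total_minutes, 60)

print(f"{total_minutes} minutes is {whole_hours} hours and {remaining_minutes} minutes.")
# Output: 135 minutes is 2 hours and 15 minutes.
```

Unlike plain division with `/`, this keeps both parts as integers, which is often easier to read when reporting durations.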


# Objectives

In this learning module, we aim to achieve the following objectives:

1. **Understanding Basic Arithmetic Operations:**
   - Learn and practice fundamental arithmetic operations, including addition, subtraction, multiplication, division, floor division, modulo, and exponentiation.

2. **Applying Arithmetic Operations in Python:**
   - Gain hands-on experience applying arithmetic operations in the Python programming language.
   - Explore how Python handles different types of arithmetic calculations.

3. **Introduction to Data Science Tools:**
   - Familiarize yourself with popular data science programming languages such as Python, R, and Julia.
   - Explore key libraries and tools used in data manipulation, visualization, and machine learning.

4. **Hands-on Coding Exercises:**
   - Engage in coding exercises to reinforce your understanding of arithmetic operations and data science tools.
   - Apply learned concepts through practical examples and exercises.

5. **Applying Arithmetic in Data Science Scenarios:**
   - Understand how arithmetic operations play a crucial role in data science applications.
   - Learn to use arithmetic calculations for data manipulation and analysis.

6. **Introduction to Jupyter Notebooks:**
   - Gain familiarity with Jupyter Notebooks, a popular interactive computing environment for data science.
   - Understand the structure of Jupyter Notebooks and how to execute code cells.

7. **Practical Examples and Applications:**
   - Explore real-world examples and applications where arithmetic operations and data science tools are employed.
   - Apply your knowledge to solve practical problems in data science scenarios.

These objectives aim to provide you with a solid foundation in basic arithmetic operations, programming languages for data science, and practical skills in using relevant tools for data manipulation and analysis.
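
As a small illustration of how arithmetic operators support data manipulation and analysis, the sketch below computes a mean and rescales values to the range 0 to 1 using only built-in Python. The sample readings and all variable names are hypothetical, chosen purely for illustration; in practice a library such as NumPy or Pandas would usually handle this.

```python
# Hypothetical sample data (not from the original module)
readings = [12, 15, 9, 20, 14]

# Mean: sum of the values divided by how many there are
mean = sum(readings) / len(readings)

# Min-max scaling: (value - min) / (max - min) maps each value into [0, 1]
lowest, highest = min(readings), max(readings)
scaled = [(value - lowest) / (highest - lowest) for value in readings]

print(f"Mean: {mean}")  # Output: Mean: 14.0
print(f"Scaled: {[round(s, 2) for s in scaled]}")  # Output: Scaled: [0.27, 0.55, 0.0, 1.0, 0.45]
```

Note that this sketch assumes the readings are not all identical; otherwise `highest - lowest` would be zero and the division would raise `ZeroDivisionError`.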


# Author

This module was prepared by Matej Koval.
