Notebook

Introduction


Some popular programming languages used in data science include:

1. Python – Widely used for data analysis, machine learning, and visualization due to its extensive libraries (NumPy, pandas, scikit-learn, TensorFlow, etc.).
2. R – Popular for statistical computing, data visualization, and analysis (ggplot2, dplyr, caret).
3. SQL – Essential for querying and managing structured databases.
4. Julia – Known for its high performance in numerical computing and data analysis.
5. Scala – Often used with Apache Spark for big data processing.
6. Java – Sometimes used in enterprise data science applications and big data frameworks like Hadoop.
7. C/C++ – Used for performance-intensive tasks in data science and AI applications.
8. MATLAB – Used for numerical computing, data analysis, and visualization in academic and engineering applications.
9. SAS – Commonly used in business analytics and statistical modeling.
10. JavaScript – Useful for data visualization in web applications using libraries like D3.js.

Data science libraries categorized by functionality:

1. Python Libraries

Data Manipulation & Analysis:

- pandas – Data manipulation and analysis (DataFrames, CSV processing).
- NumPy – Numerical computing, arrays, and matrix operations.
- Dask – Parallel computing for large datasets.

Machine Learning & AI:

- scikit-learn – Machine learning algorithms (classification, regression, clustering).
- TensorFlow – Deep learning framework by Google.
- PyTorch – Deep learning library originally developed by Facebook (now Meta).
- XGBoost – Gradient boosting algorithms for predictive modeling.
- LightGBM – Gradient boosting designed for speed on large datasets.

Data Visualization:

- Matplotlib – Basic plotting and visualization.
- Seaborn – Statistical data visualization based on Matplotlib.
- Plotly – Interactive visualizations and dashboards.
- Bokeh – Interactive plots for web applications.

Natural Language Processing (NLP):

- NLTK – Basic NLP tasks (tokenization, stemming).
- spaCy – Advanced NLP processing with deep learning integration.
- Gensim – Topic modeling and document similarity.
- TextBlob – Simplified text processing.

Big Data & Distributed Computing:

- PySpark – Python interface for Apache Spark.
- Dask – Scalable parallel computing.

Data Collection & Web Scraping:

- BeautifulSoup – Parsing HTML and XML for web scraping.
- Scrapy – Advanced web crawling.
- requests – HTTP requests for web data.

Time Series Analysis:

- statsmodels – Statistical models and hypothesis testing.
- Prophet – Time series forecasting tool originally developed at Facebook.

2. R Libraries

Data Manipulation & Analysis:

- dplyr – Data manipulation.
- tidyr – Data tidying and reshaping.
- data.table – Fast data manipulation.

Machine Learning:

- caret – Machine learning workflow.
- randomForest – Random forest implementation.
- xgboost – Gradient boosting.

Visualization:

- ggplot2 – Data visualization and plotting.
- shiny – Web applications and dashboards.

3. SQL and Big Data Tools

- Apache Spark – Scalable big data processing.
- Hadoop – Distributed storage and processing.
- Hive – SQL-based data querying for Hadoop.
- Presto – SQL-based querying engine for large datasets.

# Data Science Tools

| Category | Tools | Description |
|----------|-------|-------------|
| **Programming Languages** | Python, R, SQL, Julia, Scala, Java | Used for data manipulation, analysis, and modeling. |
| **Data Manipulation** | pandas, NumPy, dplyr, data.table | Handling and processing structured data. |
| **Machine Learning** | scikit-learn, TensorFlow, PyTorch, XGBoost | Algorithms and frameworks for AI and ML. |
| **Visualization** | Matplotlib, Seaborn, ggplot2, Plotly | Data visualization and plotting tools. |
| **Big Data Processing** | Apache Spark, Hadoop, Dask | Handling large-scale data efficiently. |
| **NLP (Natural Language Processing)** | NLTK, spaCy, Gensim, TextBlob | Text analysis and language processing. |
| **Data Collection** | BeautifulSoup, Scrapy, requests | Web scraping and data extraction. |
| **Time Series Analysis** | statsmodels, Prophet | Forecasting and analyzing time-dependent data. |
| **Deployment & Automation** | Docker, Kubernetes, MLflow | Managing and deploying models. |
| **Data Storage** | MySQL, PostgreSQL, MongoDB, HDFS | Databases for storing and retrieving data. |


# Arithmetic Expressions in Python

Arithmetic expressions perform mathematical operations such as addition, subtraction, multiplication, and division. Python provides a built-in operator for each of these, as well as for floor division, modulus, and exponentiation.
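
When an expression combines several operators, Python evaluates them by precedence rather than strictly left to right. The short sketch below illustrates this; the variable names are chosen only for this example.

```python
# Multiplication binds tighter than addition, so 3 * 4 is evaluated first.
without_parens = 2 + 3 * 4   # evaluated as 2 + (3 * 4)

# Parentheses override the default precedence.
with_parens = (2 + 3) * 4    # evaluated as 5 * 4

# Exponentiation binds tighter than a unary minus on its left.
negated_square = -2 ** 2     # evaluated as -(2 ** 2)

print(without_parens, with_parens, negated_square)
```

Adding parentheses, even where they are not strictly required, often makes the intended order easier to read.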

## Common Arithmetic Operators:

| Operator | Description           | Example           | Result |
|----------|-----------------------|-------------------|--------|
| `+`      | Addition               | `5 + 3`            | `8`    |
| `-`      | Subtraction            | `10 - 4`           | `6`    |
| `*`      | Multiplication         | `7 * 2`            | `14`   |
| `/`      | Division               | `15 / 3`           | `5.0`  |
| `//`     | Floor Division         | `17 // 3`          | `5`    |
| `%`      | Modulus (Remainder)     | `17 % 3`           | `2`    |
| `**`     | Exponentiation (Power)  | `2 ** 3`           | `8`    |
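
The three division-related operators are easy to confuse, so here is a minimal sketch of how they relate, using the same operands as the table. The variable names are illustrative only.

```python
a, b = 17, 3

# divmod returns the floor-division quotient and the remainder together.
quotient, remainder = divmod(a, b)   # same as (a // b, a % b)

# The two results always satisfy: a == b * quotient + remainder
identity_holds = (a == b * quotient + remainder)

# True division always produces a float, even when the result is whole.
true_division = 15 / 3

# Floor division rounds toward negative infinity, not toward zero,
# and the remainder takes the sign of the divisor.
neg_quotient = -17 // 3
neg_remainder = -17 % 3

print(quotient, remainder, identity_holds)
print(true_division, neg_quotient, neg_remainder)
```

Note the negative case: `-17 // 3` is `-6` rather than `-5`, and `-17 % 3` is `1`, which keeps the identity above true.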

## Example Usage in Python:

```python
# Performing basic arithmetic operations
a = 10
b = 3

addition = a + b         # 13
subtraction = a - b      # 7
multiplication = a * b   # 30
division = a / b         # 3.3333333333333335
floor_division = a // b  # 3
modulus = a % b          # 1
exponentiation = a ** b  # 1000

print(addition, subtraction, multiplication, division, floor_division, modulus, exponentiation)
```

```python
# Multiply and add numbers
num1 = 5
num2 = 3

multiplication_result = num1 * num2
addition_result = num1 + num2

print("Multiplication:", multiplication_result)
print("Addition:", addition_result)
```

Output:

```
Multiplication: 15
Addition: 8
```

```python
# Convert minutes to hours
def convert_minutes_to_hours(minutes):
    """Return the given number of minutes expressed in hours (as a float)."""
    hours = minutes / 60
    return hours

# Example conversion
minutes = 120
print(f"{minutes} minutes is equal to {convert_minutes_to_hours(minutes)} hours")
```

Output:

```
120 minutes is equal to 2.0 hours
```
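
The conversion above returns fractional hours. If whole hours plus leftover minutes are wanted instead, one possible variation is sketched below. The helper `split_minutes` is a hypothetical name introduced for this example, and it assumes a non-negative whole number of minutes.

```python
def split_minutes(total_minutes):
    """Split a non-negative whole number of minutes into (hours, minutes)."""
    # divmod performs floor division and modulus in a single step.
    hours, minutes = divmod(total_minutes, 60)
    return hours, minutes

# Example: 135 minutes
hours, minutes = split_minutes(135)
print(f"135 minutes is {hours} h {minutes} min")
```

This relies on the floor division and modulus operators from the table of arithmetic operators, combined through the built-in `divmod`.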


# Objectives

- Understand basic arithmetic operations in Python.
- Learn how to convert time units (minutes to hours).
- Use Markdown to document code effectively.
- Develop problem-solving skills through exercises.
- Gain familiarity with writing and running Python code in Jupyter notebooks.


Author: Ekta Ispande