# JupiterAssignmentIBM

### Introduction

Welcome to the introductory assignment for the IBM Data Science Professional Certification! In this assignment, you'll explore the foundational concepts and tools of data science.

Throughout this notebook, you'll have the opportunity to demonstrate your understanding of programming languages, libraries, and tools commonly used in data science. Additionally, you'll apply basic arithmetic operations to solve simple mathematical problems.

Your grade for this assignment will be based on the completion of various exercises, each contributing to a total of 25 points. Please ensure that you carefully follow the instructions for each exercise and submit your completed notebook via GitHub.

Let's dive in and get started!

## Data Science Languages

1. Python
2. R
3. SQL (Structured Query Language)
4. Java
5. Scala
6. Julia
7. MATLAB
8. SAS (Statistical Analysis System)
9. JavaScript (for web-based data visualization)
10. C/C++

## Data Science Libraries

1. NumPy: Fundamental package for numerical computing in Python.
2. Pandas: Data manipulation and analysis library, offering data structures like DataFrame.
3. Matplotlib: Comprehensive library for creating static, animated, and interactive visualizations in Python.
4. Seaborn: Statistical data visualization library based on Matplotlib, providing a high-level interface for drawing attractive and informative statistical graphics.
5. Scikit-learn: Machine learning library that supports various supervised and unsupervised learning algorithms.
6. TensorFlow: Open-source machine learning framework developed by Google for large-scale machine learning applications.
7. PyTorch: Deep learning framework that facilitates building and training neural networks.
8. Keras: High-level neural networks API that runs on top of backends such as TensorFlow.
9. StatsModels: Statistical modeling and hypothesis testing library in Python.
10. NLTK (Natural Language Toolkit): Library for natural language processing (NLP) tasks such as tokenization, stemming, tagging, parsing, and more.

## Data Science Tools


| Category          | Tool                               | Description                                                                                                 |
|-------------------|------------------------------------|-------------------------------------------------------------------------------------------------------------|
| Programming       | Python                             | General-purpose programming language with extensive libraries for data manipulation, analysis, and ML.      |
|                   | R                                  | Statistical programming language widely used for data analysis, statistical modeling, and visualization.     |
|                   | SQL                                | Standard language for managing and querying relational databases.                                            |
| Data Manipulation | Pandas                             | Python library for data manipulation and analysis, providing DataFrame data structures and tools.           |
|                   | NumPy                              | Fundamental package for scientific computing in Python, providing support for large, multi-dimensional arrays.|
| Data Visualization| Matplotlib                         | Comprehensive plotting library for creating static, animated, and interactive visualizations in Python.      |
|                   | Seaborn                            | Statistical data visualization library based on Matplotlib, offering high-level interface for attractive plots.|
|                   | Plotly                             | Open-source graphing library for creating interactive plots and dashboards.                                   |
| Machine Learning | Scikit-learn                       | Machine learning library in Python, providing simple and efficient tools for data mining and analysis.       |
|                   | TensorFlow                         | Open-source machine learning framework developed by Google for large-scale ML applications.                  |
|                   | PyTorch                            | Deep learning framework that facilitates building and training neural networks.                              |
|                   | XGBoost                            | Scalable and efficient implementation of gradient boosting algorithms.                                       |
|                   | LightGBM                           | Gradient boosting framework developed by Microsoft with high performance.                                     |
| Natural Language Processing | NLTK (Natural Language Toolkit) | Library for NLP tasks such as tokenization, stemming, tagging, parsing, and more. |
|                   | spaCy                              | NLP library for advanced NLP tasks such as named entity recognition, part-of-speech tagging, etc.            |
|                   | Gensim                             | Library for topic modeling and document similarity analysis.                                                  |
| Data Wrangling   | OpenRefine                        | Open-source tool for cleaning and transforming messy data.                                                    |
|                   | Trifacta Wrangler                 | Data wrangling tool for exploring, cleaning, and preparing data for analysis.                                  |
| Version Control  | Git                               | Distributed version control system for tracking changes in source code during software development.         |
|                   | GitHub                            | Web-based platform for hosting and sharing Git repositories, often used for collaborative development.        |
| Cloud Computing | Amazon Web Services (AWS)         | Cloud computing platform offering various services for storage, computing, analytics, and more.               |
|                   | Google Cloud Platform (GCP)       | Suite of cloud computing services by Google, providing infrastructure, data analytics, ML, and more.          |
|                   | Microsoft Azure                   | Cloud computing platform by Microsoft, offering services for computing, analytics, storage, and ML.         |

This table provides a brief overview of some of the key tools across different categories in the Data Science ecosystem.

## Arithmetic Expression Examples

In this section, we will explore some basic arithmetic operations using Python. Arithmetic operations are fundamental in data science for various calculations and transformations of data.
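
As a minimal sketch of how Python evaluates arithmetic expressions, the example below shows operator precedence and the difference between true division (`/`) and floor division (`//`). The helper name `minutes_to_hours_and_minutes` and all sample values are illustrative assumptions, not part of the assignment.

```python
# Illustrative sketch; names and sample values are assumptions, not from the assignment.

def minutes_to_hours_and_minutes(total_minutes):
    """Split a whole number of minutes into (hours, remaining minutes)."""
    # divmod returns the floor quotient and the remainder in one call
    return divmod(total_minutes, 60)

# Multiplication binds tighter than addition, so these parentheses are optional
with_parens = (3 * 4) + 5
without_parens = 3 * 4 + 5

# Parentheses change the result when they group the addition first
grouped_addition = 3 * (4 + 5)

# True division always returns a float; floor division of two ints returns an int
true_quotient = 100 / 4
floor_quotient = 100 // 4

print(with_parens, without_parens, grouped_addition)  # 17 17 27
print(true_quotient, floor_quotient)                  # 25.0 25
print(minutes_to_hours_and_minutes(400))              # (6, 40)
```

Using `divmod` is one possible design choice: it keeps whole hours and leftover minutes as integers, which avoids the long decimal fraction that plain division by 60 can produce.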

### Basic Operations

```python
# Example of addition
num1 = 10
num2 = 20
sum_result = num1 + num2
print("The sum of", num1, "and", num2, "is:", sum_result)

# Example of subtraction
num1 = 50
num2 = 30
difference = num1 - num2
print("The difference between", num1, "and", num2, "is:", difference)

# Example of multiplication
num1 = 8
num2 = 5
product = num1 * num2
print("The product of", num1, "and", num2, "is:", product)

# Example of division (the / operator always returns a float)
num1 = 100
num2 = 4
quotient = num1 / num2
print("The quotient of", num1, "divided by", num2, "is:", quotient)

# Example of exponentiation
base = 2
exponent = 3
result = base ** exponent
print("The result of", base, "raised to the power of", exponent, "is:", result)
```

### Multiplication and Addition

```python
# Multiplication
num1 = 5
num2 = 8
product = num1 * num2
print("The product of", num1, "and", num2, "is:", product)

# Addition
num3 = 10
num4 = 15
sum_result = num3 + num4
print("The sum of", num3, "and", num4, "is:", sum_result)
```

Output:

```
The product of 5 and 8 is: 40
The sum of 10 and 15 is: 25
```

### Converting Minutes to Hours

```python
# Conversion from minutes to hours
minutes = int(input())  # read a whole number of minutes from the user
hours = minutes / 60
print(minutes, "minutes is equal to", hours, "hours.")
```

Output for an input of `400`:

```
400 minutes is equal to 6.666666666666667 hours.
```


## Objectives

1. Gain familiarity with fundamental concepts and tools in data science.
2. Explore programming languages commonly used in data science, such as Python and R.
3. Understand the role of data science libraries, including Pandas, NumPy, and Matplotlib.
4. Identify key data science tools for tasks like data manipulation, visualization, and machine learning.
5. Practice basic arithmetic operations using Python for numerical computations.
6. Convert units of measurement, such as minutes to hours, to solve real-world problems efficiently.


### Author

Aswin P
