# 📜 IBM Data Science Professional Certificate  
*Curiosity to Capability — One Notebook at a Time*

---

**Compiled and Authored by:**  
**Partho Sarothi Das**  
Dhaka, Bangladesh  
🎓 Bachelor's & Master's in Statistics  
💼 Investment Banking Professional → Aspiring Data Scientist  

>**Disclaimer:** This notebook is based on content from the [IBM Data Science Professional Certificate](https://www.coursera.org/professional-certificates/ibm-data-science) offered on Coursera. It is intended for personal learning and review purposes.

---

# API (Application Programming Interface)

## 🔹 What is an API?

API stands for Application Programming Interface. It is a set of rules and protocols that allows different software components to communicate with each other. Think of it as a messenger that takes requests from one system, tells another system what to do, and brings back the response.

### In Simple Terms:

* An **API is like a waiter** in a restaurant.

  * You (the client) tell the waiter what you want.
  * The waiter (API) tells the kitchen (server/system) what to make.
  * The kitchen prepares it and gives it to the waiter.
  * The waiter delivers it back to you.

---

## 🔹 Why are APIs Important?

1. **Code Reusability:** You don’t need to build everything from scratch. Use existing APIs for complex tasks (e.g., Google Maps API for location).

2. **Modularity:** Systems can be broken down into services that communicate via APIs.

3. **Data Access:** APIs allow access to real-time data (e.g., weather data, financial market data, cryptocurrency data).

4. **Scalability:** APIs make systems easier to scale and manage.

---

## 🔹 Types of APIs

1. **Library/API within a Program (Local APIs):**

   * Example: Using the pandas API in Python to manipulate DataFrames.
   * You call methods like `.mean()`, `.groupby()`, `.head()`, etc.

2. **Web APIs (Remote APIs or REST APIs):**

   * Allow different systems to communicate over the internet.
   * Example: Getting COVID-19 data from an online source using a REST API.

3. **Operating System APIs:**

   * Interfaces provided by OS like Windows, Linux, Android for developers to use system-level functions (e.g., file operations, memory management).

4. **Hardware APIs:**

   * Communicate with hardware devices such as cameras, sensors, etc.
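
As a rough illustration of the first and third categories, the sketch below calls a local library API and an operating system API from Python. It uses the standard-library `statistics`, `os`, and `tempfile` modules instead of pandas so that it runs without extra installs. The sample prices and the file name are made up for the example.

```python
import os
import statistics
import tempfile

# Local (library) API: call a function the library exposes, no network involved.
prices = [101.5, 99.0, 102.5]  # made-up sample values
average_price = statistics.mean(prices)

# Operating system API: Python's os/tempfile modules wrap system-level file operations.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "prices.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(str(average_price))
    file_size = os.path.getsize(path)  # size in bytes, as reported by the OS

print(average_price, file_size)
```

In both cases you only need to know the function names and their inputs and outputs, not how the library or the operating system does the work internally.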

---

## 🔹 REST API (Most Common Web API Type)

REST stands for **Representational State Transfer**. It’s a set of architectural principles for designing networked applications.

### REST API Key Concepts:

* **Client:** Your app or script that sends the request.  

* **Server (Resource):** The system that provides the service or data.   

* **Endpoint:** URL to which requests are sent (e.g., `https://api.coingecko.com/api/v3/coins/bitcoin`)

* **HTTP Methods:**

  * `GET`: Retrieve data
  * `POST`: Send new data
  * `PUT`: Update data
  * `DELETE`: Delete data

* **Request:** Made by the client to the API endpoint.

* **Response:** Returned by the API, usually in **JSON format**.

Example JSON response:

```json
{
  "bitcoin": {
    "usd": 29350.23,
    "market_cap": 574000000000
  }
}
```
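
To show how a response like this is read in Python, here is a minimal sketch that parses the example body with the standard-library `json` module. The `raw_body` string is simply the example above typed in by hand, not a live API response.

```python
import json

# The example response body, as the raw text a server might send back.
raw_body = '{"bitcoin": {"usd": 29350.23, "market_cap": 574000000000}}'

data = json.loads(raw_body)                  # JSON object -> Python dict
usd_price = data["bitcoin"]["usd"]           # nested lookup by key
market_cap = data["bitcoin"]["market_cap"]

print(usd_price, market_cap)
```

When using `requests`, calling `.json()` on the response performs this parsing step for you.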

---
## 🔹 Real-Life Use Cases of APIs

| Use Case                | Example API                      |
| ----------------------- | -------------------------------- |
| Weather forecasting     | OpenWeatherMap API               |
| Cryptocurrency price    | CoinGecko, CoinMarketCap API     |
| Social media automation | Twitter API, Facebook Graph API  |
| Machine learning models | Hugging Face Inference API       |
| Stock market data       | Alpha Vantage, Yahoo Finance API |
| Language translation    | Google Translate API             |

---

## 🔹 Key Takeaways

* APIs allow **software systems to communicate** in a standardized way.
* You use **functions/methods and URLs** to send requests and receive responses.
* REST APIs are especially useful for getting **real-time, online data**.
* JSON is the most common data format used in web API communication.
* Python’s `requests`, `http.client`, and libraries like `PyCoinGecko` make API usage simple.

---

```python
# Using an API in Python (Simple Workflow)
import requests

response = requests.get("https://api.coingecko.com/api/v3/simple/price?ids=bitcoin&vs_currencies=usd")
data = response.json()
print(data["bitcoin"]["usd"])
```

Output:

```
115163
```


# HTTP Protocol (HyperText Transfer Protocol)

### 🔹 Key Concepts Discussed:

**1. URL (Uniform Resource Locator)**

A URL is used to locate web resources and consists of three parts:

* **Scheme**: Protocol (e.g., `http://`)
* **Base URL**: Server address (e.g., `www.ibm.com`)
* **Route/Path**: Specific location on the server (e.g., `/images/logo.png`)
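
The three parts can be pulled out of a URL with the standard-library `urllib.parse` module. This is a small sketch; the URL is assembled from the example pieces listed above.

```python
from urllib.parse import urlsplit

url = "http://www.ibm.com/images/logo.png"  # assembled from the examples above
parts = urlsplit(url)

scheme = parts.scheme      # protocol
base_url = parts.netloc    # server address
route = parts.path         # location on the server

print(scheme, base_url, route)
```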

---

**2. HTTP Request & Response Process**

* When a client (like a browser) requests a resource (e.g., `index.html`), it sends an **HTTP request** to the server.
* The server processes the request and sends back an **HTTP response**, which contains:

  * **Start Line**: HTTP version and status code (e.g., `HTTP/1.1 200 OK`)
  * **Response Header**: Metadata (e.g., file type, length)
  * **Response Body**: The actual requested content (e.g., HTML file, image)
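
To make the three parts of a response concrete, the sketch below splits a hand-written raw HTTP response into its start line, headers, and body. The `raw_response` text is a made-up example, and real code would normally let a library such as `requests` do this parsing.

```python
# A made-up raw HTTP response, as it might travel over the wire.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<html></html>"
)

# A blank line separates the head (start line + headers) from the body.
head, body = raw_response.split("\r\n\r\n", 1)
start_line, *header_lines = head.split("\r\n")

version, status_code, reason = start_line.split(" ", 2)
headers = dict(line.split(": ", 1) for line in header_lines)

print(version, status_code, reason)
print(headers["Content-Type"], len(body))
```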

---

**3. HTTP Methods**

* **GET**: Requests data from the server (e.g., a web page or file).
* **POST**: Sends data to the server.
* Other methods include PUT, DELETE, etc.

---

**4. HTTP Status Codes**

These indicate the result of the request:

* **100–199**: Informational (e.g., 100 – Continue)
* **200–299**: Success (e.g., 200 – OK)
* **300–399**: Redirection (e.g., 301 – Moved Permanently)
* **400–499**: Client Errors (e.g., 401 – Unauthorized)
* **500–599**: Server Errors (e.g., 501 – Not Implemented)
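
The ranges above depend only on the first digit of the code, which the following sketch uses to classify a status code. The `status_class` helper is a hypothetical name for this example, and the 3xx "Redirection" class is included for completeness. The reason phrases come from the standard-library `http.HTTPStatus` enum.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Return the broad category for an HTTP status code."""
    classes = {
        1: "Informational",
        2: "Success",
        3: "Redirection",
        4: "Client Error",
        5: "Server Error",
    }
    return classes.get(code // 100, "Unknown")


for code in (100, 200, 401, 501):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```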

---

# HTTP Protocol with Python's `Requests` Library

### 🔹 What is the Requests Library?

* `requests` is a user-friendly Python library for sending **HTTP/1.1 requests**.
* It simplifies interaction with web services by handling common tasks like URL encoding, HTTP headers, and response parsing.

---

### 🔹 GET Requests

* A **GET request** is used to **retrieve data** from a server.
* Example:

  ```python
  import requests
  r = requests.get("http://www.ibm.com")
  ```
* Key attributes and methods of the response object (`r` above):

  * `status_code`: Shows HTTP status (e.g., `200` for success)
  * `headers`: Contains metadata like `Date` and `Content-Type`
  * `encoding`: Shows character encoding (e.g., UTF-8)
  * `text`: Contains the response body (e.g., HTML content)
  * `json()`: Parses response text into a Python dictionary (if response is JSON)

#### Query Strings with GET:

* You can send parameters in the URL via a **query string** using a dictionary:

  ```python
  payload = {"name": "Joseph", "ID": "123"}
  r = requests.get("https://httpbin.org/get", params=payload)
  ```
* The query string appears in the URL as `?name=Joseph&ID=123`.
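
To see how the dictionary becomes a query string without making a network call, here is a minimal sketch using the standard-library `urllib.parse`. It is roughly the encoding step that `requests` handles for you when you pass `params=`, shown here as an illustration and not as the library's exact internals.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

payload = {"name": "Joseph", "ID": "123"}

# Encode the dictionary as key=value pairs joined by "&".
query_string = urlencode(payload)
full_url = "https://httpbin.org/get?" + query_string

# Going the other way: recover the parameters from the URL.
recovered = parse_qs(urlsplit(full_url).query)

print(full_url)
print(recovered)
```

Note that `parse_qs` returns each value as a list, because a query string may repeat the same key more than once.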

---

### 🔹 POST Requests

* A **POST request** is used to **send data to a server**, especially when modifying or uploading content.
* Unlike GET, data is sent in the **request body**, not in the URL.

  ```python
  payload = {"name": "Joseph", "ID": "123"}
  r = requests.post("https://httpbin.org/post", data=payload)
  ```
* The `form` field in the JSON response shows the submitted data.
* The `url` attribute of the response does not contain the query string for POST requests.
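
The difference in where the data travels can be checked offline. The sketch below builds (but does not send) a POST request with the standard-library `urllib.request`, used here in place of `requests` so that no network access or extra install is needed.

```python
from urllib.parse import urlencode
from urllib.request import Request

payload = {"name": "Joseph", "ID": "123"}

# Form-encode the payload and attach it as the request body (bytes).
body = urlencode(payload).encode("ascii")
req = Request("https://httpbin.org/post", data=body, method="POST")

print(req.get_method())  # the HTTP method
print(req.full_url)      # the URL carries no query string
print(req.data)          # the form data travels in the body
```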

---

### 🔹 GET vs. POST Comparison

| Feature              | GET                   | POST            |
| -------------------- | --------------------- | --------------- |
| Data Location        | In URL (query string) | In request body |
| URL Contains Params? | Yes                   | No              |
| Use Case             | Retrieving data       | Sending data    |
| Has Request Body?    | No                    | Yes             |

---

# Module 5 Summary: APIs and Data Collection

Congratulations! You have completed this module. At this point, you know that: 

Simple APIs in Python are application programming interfaces that provide straightforward and easy-to-use methods for interacting with services, libraries, or data, often with minimal configuration or complexity.

An API lets two pieces of software talk to each other.

Using an API library in Python entails importing the library, calling its functions or methods to make HTTP requests, and parsing the responses to access data or services provided by the API.

The pandas API processes data by communicating with other software components.

An instance is created when you build a dictionary and pass it to the `DataFrame` constructor to create a pandas object.

The `head()` method displays the specified number of rows from the top of a DataFrame (default 5), while the `mean()` method calculates and returns the mean of the values.

REST APIs let you communicate over the internet and take advantage of resources such as storage, additional data, and AI algorithms.

HTTP methods transmit data over the internet.

An HTTP message sent to a REST API typically carries a JSON payload with instructions for the operation.

HTTP messages containing JSON are returned to the client as the response from web services.

Dealing with time series data involves using the Pandas time series function. 

You can get data for daily candlesticks and plot the chart using Plotly with the candlestick plot. 

The HTTP (HyperText Transfer Protocol) transfers data, including web pages and resources, between a client (a web browser) and a server on the World Wide Web.

The HTTP protocol is commonly used for implementing various types of REST APIs.

An HTTP response includes information such as the type of the resource, the length of the resource, and so on.

Uniform resource locator (URL) is the most popular way to find resources on the web.

A URL is divided into three parts: scheme, internet address or base URL, and route.

The GET method is one of the most popular methods of requesting information. Some other methods, such as POST, may also include a body.

The response message contains the HTTP version, a status code, headers, and the body of the response.

POST submits data to the server, PUT updates data already on the server, and DELETE deletes data from the server.

Requests is a Python library that allows you to send HTTP/1.1 requests easily.

You can modify the results of a GET request by changing its query string.

You can pass multiple parameters, such as name and ID, in a URL with a query string.

Web scraping in Python involves extracting and parsing data from websites to gather information for various applications, using libraries like Beautiful Soup and requests.

HTML comprises text surrounded by elements enclosed in angle brackets, called tags.

You can select an HTML element on a web page to inspect the webpage.

Web pages may also contain CSS and JavaScript along with HTML elements.

Each HTML document is like an HTML Tree, which may contain strings and other tags.

Each HTML table is comprised of table tags and is structured with elements such as rows, headers, body and so on.

Tabular data can also be extracted from web pages using the `read_html` method in Pandas.

Beautiful Soup is a Python library for parsing and navigating HTML and XML documents, making it easier to extract and manipulate data from web pages.

To parse a document, pass it to the `BeautifulSoup` constructor to get a `BeautifulSoup` object that represents the document as a nested data structure.

Beautiful Soup represents HTML as a set of tree-like objects with methods to parse the HTML.

A `NavigableString` is like a Python string that also supports Beautiful Soup functionality.

`find_all` is a method used to extract content based on a tag’s name, its attributes, the text of a string, or some combination of these.

The `find_all` method looks through a tag’s descendants and retrieves all descendants that match your filters.

The result is a Python iterable like a list.
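
The idea of walking an HTML tree and collecting matching tags can be sketched with the standard-library `html.parser`, so that it runs without third-party installs. This is a stand-in for Beautiful Soup's `find_all`, not a reproduction of it. The HTML snippet and the `CellCollector` class are hypothetical.

```python
from html.parser import HTMLParser

# A made-up HTML table with one header row and one data row.
html_doc = """
<table>
  <tr><th>Coin</th><th>Price</th></tr>
  <tr><td>bitcoin</td><td>29350.23</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <th> and <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one list of cell strings per <tr>
        self.in_cell = False  # True while inside a <th> or <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        # Ignore whitespace between tags; keep only text inside cells.
        if self.in_cell and self.rows:
            self.rows[-1].append(data.strip())


parser = CellCollector()
parser.feed(html_doc)
print(parser.rows)
```

With Beautiful Soup installed, the same result is usually obtained far more concisely by calling `find_all` on the parsed document.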

File formats refer to the specific structure and encoding rules used to store and represent data in files, such as .txt for plain text or .csv for comma-separated values.

Python works with different file formats such as CSV, XML, JSON, xlsx, and so on.

The extension of a file name tells you what type of file it is and which application is needed to open it.

To access data from CSV files, we can use Python libraries such as Pandas.

Similarly, different methods help parse JSON, XML, and other files.
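
As a small illustration of the last few points, the sketch below parses the same made-up record from CSV, JSON, and XML text using only the standard library. The sample strings are hypothetical, and in practice pandas offers higher-level readers such as `read_csv` for files on disk.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in three file formats.
csv_text = "name,ID\nJoseph,123\n"
json_text = '{"name": "Joseph", "ID": "123"}'
xml_text = "<person><name>Joseph</name><ID>123</ID></person>"

# CSV: each data row becomes a dict keyed by the header row.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# JSON: a JSON object becomes a Python dict.
json_obj = json.loads(json_text)

# XML: walk the child elements of the root and read tag names and text.
xml_root = ET.fromstring(xml_text)
xml_obj = {child.tag: child.text for child in xml_root}

print(csv_rows[0], json_obj, xml_obj)
```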