<a href="https://colab.research.google.com/github/mounikapentukar-alt/Projects/blob/main/Working_with_APIs_(REST_%26_JSON)_%E2%80%93_Data_Extraction_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Working with APIs (REST & JSON) – Data Extraction Project

Overview

This project demonstrates my understanding of working with APIs (Application Programming Interfaces) to extract real-time data from online platforms using REST architecture and JSON data formats.

What I Understand About APIs
What is an API?

An API (Application Programming Interface) allows applications to communicate with each other. In data analytics, APIs are commonly used to pull live data from platforms like:

Financial markets

Healthcare systems

Weather services

Social media platforms

E-commerce platforms

**REST APIs (Representational State Transfer)**

I understand how REST APIs work based on:

HTTP Methods


GET → Retrieve data

POST → Send/Create data

PUT → Update data

DELETE → Remove data
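To show how the four methods above map onto actual requests, here is a minimal sketch that builds (but does not send) one request per method with Python's standard `urllib.request`. The URL is the placeholder endpoint used in this document, and the payloads and the `build_request` helper are hypothetical names chosen for illustration.

```python
import json
from urllib.request import Request

BASE_URL = "https://api.example.com/v1/users"  # placeholder endpoint, not a real API


def build_request(method, url, payload=None):
    """Build (but do not send) an HTTP request for the given method."""
    body = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        # Request bodies are sent as bytes, so the dict is serialized to JSON first
        body = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=body, headers=headers, method=method)


# One request per method; the payloads are made-up sample data
requests_by_action = {
    "retrieve": build_request("GET", BASE_URL),
    "create": build_request("POST", BASE_URL, {"name": "Alice"}),
    "update": build_request("PUT", BASE_URL + "/1", {"name": "Alice B."}),
    "remove": build_request("DELETE", BASE_URL + "/1"),
}

for action, req in requests_by_action.items():
    print(action, req.get_method(), req.full_url)
```

Sending a request would be a call such as `urllib.request.urlopen(req)`; it is left out here because the endpoint is only a placeholder.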

**Endpoints**

Structured URLs used to access specific resources

Example:

https://api.example.com/v1/users
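An endpoint like the one above is usually assembled from a base URL, a version, a resource name, and optional query parameters. The sketch below shows one way to do that with the standard library; `build_endpoint` and the `city` and `page` parameters are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode


def build_endpoint(base_url, version, resource, resource_id=None, **params):
    """Join the parts of an endpoint URL and append any query parameters."""
    url = f"{base_url}/{version}/{resource}"
    if resource_id is not None:
        url += f"/{resource_id}"          # address a single resource
    if params:
        url += "?" + urlencode(params)    # urlencode escapes spaces and symbols
    return url


users_url = build_endpoint("https://api.example.com", "v1", "users")
one_user_url = build_endpoint("https://api.example.com", "v1", "users", 101)
filtered_url = build_endpoint(
    "https://api.example.com", "v1", "users", city="New York", page=2
)

print(users_url)     # https://api.example.com/v1/users
print(one_user_url)  # https://api.example.com/v1/users/101
print(filtered_url)  # https://api.example.com/v1/users?city=New+York&page=2
```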


**Status Codes**
A status code is a three-digit number sent by a web server to a client (such as a web browser or API application) in response to an HTTP request.

200 → Success

201 → Created

400 → Bad Request

401 → Unauthorized

404 → Not Found

500 → Server Error
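In practice, a script usually checks the status code before reading the response body. A minimal sketch, using the standard `http.HTTPStatus` enum to look up the official phrase for each code listed above (the `describe_status` helper is a hypothetical name):

```python
from http import HTTPStatus


def describe_status(code):
    """Return the code, its standard phrase, and a broad category."""
    status = HTTPStatus(code)  # raises ValueError for codes the enum does not know
    if 200 <= code < 300:
        category = "success"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "other"
    return f"{code} {status.phrase} ({category})"


for code in (200, 201, 400, 401, 404, 500):
    print(describe_status(code))
```

The 4xx codes point to a problem with the request itself (bad parameters, missing credentials, wrong URL), while 5xx codes point to a problem on the server side.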



In [None]:
#Most REST APIs return data in JSON format.
{
  "user_id": 101,
  "name": "John Doe",
  "age": 29,
  "city": "New York"
}


In [None]:
#Ways to assign JSON data to a variable in Python
import json

json_string = '{"id": 1, "name": "Alice", "age": 25}'

data = json.loads(json_string) #json.loads() converts JSON string → Python dictionary


print(data)
print(type(data))

{'id': 1, 'name': 'Alice', 'age': 25}
<class 'dict'>


In [None]:
#If the JSON response contains a list of records, it can be converted directly into a DataFrame.

import pandas as pd

json_string = '[{"id": 1, "name": "Alice", "age": 25},{"id": 1, "name": "Alice", "age": 25}]'

data = json.loads(json_string)   #json.loads() converts a JSON array → Python list of dictionaries


df = pd.DataFrame(data)
print(df.head())

   id   name  age
0   1  Alice   25
1   1  Alice   25


In [None]:
#From a Local JSON File
import json

with open("/content/sample_data/data.json", "r") as file:
    data = json.load(file)      #json.load() reads JSON file → Python object

print(data)

[{'id': 1, 'name': 'Alice', 'address': {'city': 'New York', 'zip': '10001'}}]


In [None]:
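#Example of a nested JSON response, held as a Python dictionary
data = {
  "user": {
    "id": 101,
    "profile": {
        "name": "John",
        "email": "john@email.com"
    }
  }
}

#To extract specific fields, chain the keys from the outermost level inward
user_id = data["user"]["id"]
email = data["user"]["profile"]["email"]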
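Chained lookups like these raise a `KeyError` if a key is absent, which can happen when an API omits optional fields. One possible way to guard against that is a small helper that walks the keys and falls back to a default; `get_nested` and the `phone` field are hypothetical names used only for this sketch.

```python
data = {
    "user": {
        "id": 101,
        "profile": {"name": "John", "email": "john@email.com"},
    }
}


def get_nested(obj, *keys, default=None):
    """Walk nested dictionaries key by key; return default if any key is missing."""
    current = obj
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


email = get_nested(data, "user", "profile", "email")
phone = get_nested(data, "user", "profile", "phone", default="Unknown")

print(email)  # john@email.com
print(phone)  # Unknown
```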

In [None]:
#APIs often return nested JSON objects. These usually need to be flattened before analysis. Use json_normalize()

from pandas import json_normalize


json_string = '{"id": 1,"name": "Alice","address": {"city": "New York","zip": "10001" }}'

data = json.loads(json_string)  #json.loads() converts JSON string → Python dictionary
df = json_normalize(data)
print(df.head())

   id   name address.city address.zip
0   1  Alice     New York       10001
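To make clear what flattening means, here is a rough pure-Python sketch that produces the same dotted column names for the record above. It is only an illustration of the idea, not how pandas implements `json_normalize()`, and the `flatten` helper is a hypothetical name.

```python
def flatten(record, parent_key="", sep="."):
    """Turn nested dictionaries into one flat dictionary with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into the nested object, carrying the key prefix along
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


record = {"id": 1, "name": "Alice", "address": {"city": "New York", "zip": "10001"}}
flat_record = flatten(record)

print(flat_record)
# {'id': 1, 'name': 'Alice', 'address.city': 'New York', 'address.zip': '10001'}
```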


Handle Missing Values

APIs may return incomplete data. Handle missing values using:



df.isnull().sum()          # Check missing values

df.fillna("Unknown")       # Replace missing values (returns a new DataFrame)

df.dropna()                # Remove rows with missing values (returns a new DataFrame)
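A small worked example of these three calls, assuming pandas is installed. The records are made-up sample data in which some fields are absent, as can happen with real API responses.

```python
import pandas as pd

# Hypothetical API records; some fields are missing on purpose
records = [
    {"id": 1, "name": "Alice", "city": "New York"},
    {"id": 2, "name": "Bob"},          # "city" is missing
    {"id": 3, "city": "Chicago"},      # "name" is missing
]
df = pd.DataFrame(records)

missing_counts = df.isnull().sum()   # number of missing values per column
filled = df.fillna("Unknown")        # new DataFrame with gaps replaced
complete_rows = df.dropna()          # new DataFrame with incomplete rows removed

print(missing_counts)
print(filled)
print(complete_rows)
```

Both `fillna()` and `dropna()` leave `df` unchanged, so the result has to be assigned to a variable to be kept.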