<div style="display: flex; align-items: center;">
    <img style="width: 150px; margin-right: 10px;" src="https://upload.wikimedia.org/wikipedia/fr/thumb/e/e9/EPF_logo_2021.png/524px-EPF_logo_2021.png" alt="EPF Logo">
    <div style="text-align: center; flex: 1;">
        <h1 style="margin: 0;">Data Sources</h1>
        <p><strong>P2025: ML engineering</strong></p>
    </div>
</div>

<h2>TP 1 : API Knowledge</h2>

<h3>Name : MOHAMOUD ROBLEH Anes</h3>
<h3>Group: DEML</h3>


# General Knowledge of APIs

APIs, or Application Programming Interfaces, play a pivotal role in modern software development by facilitating communication and data exchange between different systems. They serve as bridges that allow applications to interact with each other seamlessly, enabling the creation of more robust and interconnected software.

APIs come in various forms, each serving specific purposes in the realm of software development. Let's explore some fundamental concepts:

- **Question 1:** *Name three types of API protocols. Briefly explain the primary use of each.*

  - **REST**: A common way for systems to talk to each other using URLs and standard HTTP methods (the same protocol browsers use). Great for web apps.

  - **SOAP**: Older and stricter. It exchanges XML messages and is often used in large companies that need formal contracts and secure data sharing.

  - **GraphQL**: Lets you ask for just the data you need. Great for apps with lots of connected data, such as social networks (it was created at Facebook, now Meta).

  


- **Question 2:** *What are the HTTP response code families? And what do they mean?*

  - **1xx (Informational)**: The server received your request and is still working on it.

  - **2xx (Success)**: Yay! Your request worked.

  - **3xx (Redirection)**: The thing you asked for is somewhere else.

  - **4xx (Client Error)**: You made a mistake, like asking for something that doesn’t exist.

  - **5xx (Server Error)**: The server failed to handle a valid request. Trying again later may work.

  Understanding these families helps developers diagnose and troubleshoot issues during API interactions.

- **Question 3:** *What do the HTTP response codes 201, 401, and 404 mean?*

  - **201 (Created):** Something new was created successfully, like adding a new user.
  - **401 (Unauthorized):** You’re not authenticated: your credentials are missing or invalid. (Being logged in but lacking permission is 403 Forbidden.)
  - **404 (Not Found):** The thing you’re looking for isn’t there.

- **Question 4:** *Name the 4 basic HTTP verbs.*

  - **GET:** Retrieve data from a resource.
  - **POST:** Submit data to create a resource.
  - **PUT:** Update or replace a resource entirely.
  - **DELETE:** Remove a resource.

- **Question 5:** *Explain the difference between PUT and PATCH?*

  - **PUT:** Updates an entire resource. Example: Replace the whole user profile.

  - **PATCH:** Partially updates a resource. Example: Change only the user’s email.


- **Question 6:** *Name at least two data formats commonly used in API exchanges.*

  - **JSON:** Easy-to-read format that looks like a list or dictionary.


  - **XML:** More detailed but harder to read. Used in older systems.


- **Question 7:** *How can you verify the validity of a resource without getting the entire response?*

  - Use **HEAD**: It only fetches the "header" info like file size or last update (metadata) without getting the full data.

- **Question 8:** *What are the main concepts of REST? (name them)*

  - **Client-Server:** The client (you) and server are separate.
  - **Stateless:** No saving info about you between requests.
  - **Cacheable:** Save responses so you don’t need to ask again.
  - **Uniform Interface:** Everyone follows the same rules.


- **Question 9:** *Can you explain one of the main concepts of your choice from among those you mention? (Give an example if possible)*

  - **Statelessness:** The server doesn’t remember who you are between requests.

    **Example:** Every time you ask for something, you send your info (such as an API key) so the server knows what to do.
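
The response code families from Question 2 can be sketched in a few lines of Python. This is only an illustration: `status_family` and `FAMILIES` are hypothetical names introduced here, while `http.HTTPStatus` comes from the standard library and supplies the official reason phrase for each code.

```python
from http import HTTPStatus

# Family names keyed by the first digit of the status code.
FAMILIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_family(code: int) -> str:
    """Return the family name for an HTTP status code (hypothetical helper)."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return FAMILIES[code // 100]


# The three codes discussed in Question 3.
for code in (201, 401, 404):
    # HTTPStatus(code).phrase is the standard reason phrase for the code.
    print(code, HTTPStatus(code).phrase, "->", status_family(code))
```

Checking the family first (the hundreds digit) is usually enough to decide whether to retry, fix the request, or report a server problem.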

In the subsequent sections, we will delve into practical exercises to apply and deepen our understanding of these concepts using SOAP, REST, and GraphQL APIs.


--------------------------

# Exploring SOAP APIs

### A few elements to remember about the SOAP Protocol

SOAP, which stands for Simple Object Access Protocol, is one of the earliest web service protocols. It is XML-based and was designed to provide a platform- and language-independent way to exchange data between different systems over the internet.

### Key Concepts in SOAP:

- **XML-Based Structure:** SOAP messages are structured using XML, making them both human-readable and machine-readable. This structure allows for the encapsulation of data and its transport between systems.

- **Platform and Language Independence:** One of the core objectives of SOAP is to provide a communication method that is independent of the underlying platform or programming language. This promotes interoperability between diverse systems.

- **Message Format:** SOAP messages consist of an envelope that defines the message structure and rules for processing, a set of encoding rules for data types, and conventions for representing remote procedure calls.

- **Transport Neutrality:** SOAP can be used with various transport protocols, including HTTP, SMTP, and more. This flexibility in transport makes it adaptable to different network environments.
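
To make the envelope structure described above concrete, here is a minimal sketch that builds a SOAP 1.1 envelope with the standard-library `xml.etree.ElementTree`. The helper `build_envelope`, the operation name `Echo`, and the namespace `http://example.org/demo` are placeholders invented for this illustration, not part of any real service.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(operation: str, service_ns: str, params: dict) -> str:
    """Build a SOAP 1.1 envelope as a string (hypothetical helper)."""
    # Ask ElementTree to use the conventional "soap" prefix when serializing.
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation element and its parameters live in the service namespace.
    op = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(op, f"{{{service_ns}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Placeholder operation and namespace, for illustration only.
xml_text = build_envelope("Echo", "http://example.org/demo", {"message": "hello"})
print(xml_text)
```

Building the message with an XML library rather than string formatting means special characters in parameter values are escaped automatically.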

### Objective

Obtain and display the capital of Canada, which corresponds to the ISO code "CA", using the following SOAP API.

Step-by-step guide:

- **Step 1:** Examine the XML structure of the SOAP request provided. Identify the tag name that contains the ISO country code and the tag that will return the capital name.

- **Step 2:** Modify the existing SOAP request to use the ISO code "CA" instead of "FR". Ensure that the XML structure remains correct.

- **Step 3:** Use the modified request to send a request to the SOAP services at the specified URL.

- **Step 4:** Analyze the response received. Extract and display the capital name from the SOAP response.

- **Step 5:** Remove sections of code that are not necessary to achieve this objective, in order to simplify the script.


### Documentation link :

- https://www.postman.com/cs-demo/workspace/postman-customer-org-s-public-workspace/documentation/8854915-43f6a9be-0c65-4486-bfdf-36b6548161dd?entity=request-96a53688-6305-45be-ab8b-ca1d1c88f830
- https://docs.insomnia.rest/

In [2]:
import requests
import xml.etree.ElementTree as ET


# SOAP request URL
url = "http://webservices.oorsprong.org/websamples.countryinfo/CountryInfoService.wso"

# structured XML
payload = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
                <soap:Body>
                    <CapitalCity xmlns="http://www.oorsprong.org/websamples.countryinfo">
                        <sCountryISOCode>CA</sCountryISOCode>
                    </CapitalCity>
                </soap:Body>
                </soap:Envelope>"""
# headers
headers = {
    'Content-Type': 'text/xml; charset=utf-8'
}
# POST request
response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <m:CapitalCityResponse xmlns:m="http://www.oorsprong.org/websamples.countryinfo">
      <m:CapitalCityResult>Ottawa</m:CapitalCityResult>
    </m:CapitalCityResponse>
  </soap:Body>
</soap:Envelope>


**The script that only outputs "Ottawa" as response with a proper docstring**

In [35]:
import requests
import xml.etree.ElementTree as ET

def fetch_capital(country_code):
    """
    Fetch the capital city of a given country using a SOAP API.

    Args:
        country_code (str): The ISO country code of the country (example: "CA" for Canada).

    Returns:
        str or None: The name of the capital city, or None if the response contains no result.
    """
    # SOAP request URL
    url = "http://webservices.oorsprong.org/websamples.countryinfo/CountryInfoService.wso"

    # SOAP XML payload
    payload = f"""<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
                    <soap:Body>
                        <CapitalCity xmlns="http://www.oorsprong.org/websamples.countryinfo">
                            <sCountryISOCode>{country_code}</sCountryISOCode>
                        </CapitalCity>
                    </soap:Body>
                 </soap:Envelope>"""

    # Headers
    headers = {'Content-Type': 'text/xml; charset=utf-8'}

    # Make the POST request
    response = requests.post(url, headers=headers, data=payload)

    root = ET.fromstring(response.text)
    namespaces = {'m': 'http://www.oorsprong.org/websamples.countryinfo'}
    capital = root.find('.//m:CapitalCityResult', namespaces)
    return capital.text if capital is not None else None

# Fetch and print the capital of Canada
print(fetch_capital("CA"))

Ottawa
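
The parsing step can also be checked without calling the service. The sketch below is one possible way to do it, not the required solution: `extract_capital` is a hypothetical helper, and `SAMPLE_RESPONSE` is the response printed by the first cell above. The check for a `soap:Fault` element follows the SOAP 1.1 convention and is an assumption about how this particular service reports errors.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "m": "http://www.oorsprong.org/websamples.countryinfo",
}


def extract_capital(xml_text: str) -> str:
    """Return the capital name found in a CapitalCity SOAP response."""
    root = ET.fromstring(xml_text)
    # SOAP 1.1 reports errors in a Fault element inside the Body.
    fault = root.find(".//soap:Fault", NAMESPACES)
    if fault is not None:
        reason = fault.findtext("faultstring", default="unknown")
        raise RuntimeError("SOAP fault: " + reason)
    result = root.find(".//m:CapitalCityResult", NAMESPACES)
    if result is None or not result.text:
        raise ValueError("no CapitalCityResult element in response")
    return result.text


# Response copied from the first cell's output.
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <m:CapitalCityResponse xmlns:m="http://www.oorsprong.org/websamples.countryinfo">
      <m:CapitalCityResult>Ottawa</m:CapitalCityResult>
    </m:CapitalCityResponse>
  </soap:Body>
</soap:Envelope>"""

print(extract_capital(SAMPLE_RESPONSE))
```

Separating parsing from the HTTP call keeps the XML logic testable even when the remote service is unavailable.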


--------------------------

# REST API Exercise: Star Wars Information Retrieval

### Introduction 

In this exercise, you will explore the Star Wars API (SWAPI) to retrieve and analyze data related to Star Wars characters, films, and planets. SWAPI is a RESTful web service that provides information about the Star Wars universe, accessible through various endpoints.\
This exercise is designed to enhance your understanding of working with RESTful APIs. Feel free to ask me if you have any questions. Each task builds on the previous one, so don't hesitate to ask if you are stuck. Make sure to handle bad response codes.

### A few elements to remember about the REST Protocol

REST (Representational State Transfer) is an architectural style for designing networked applications. RESTful APIs (Application Programming Interfaces) conform to the principles of REST, allowing systems to communicate over HTTP in a stateless manner. Some important aspects are:

- **Resources:** Everything is a resource, identified by a unique URI.

- **HTTP Methods:** CRUD operations are performed using standard HTTP methods (GET, POST, PUT, DELETE).

- **Stateless:** Each request from a client contains all the information needed to understand and fulfill the request.

### Key Concepts in REST:

- **Endpoint:** A specific URI representing a resource. Endpoints are URLs that define where resources can be accessed.

- **Basic HTTP Methods:** The standard verbs a client uses to act on a resource:
    - **GET:** Retrieve data from a specified resource.
    - **POST:** Submit data to be processed to a specified resource.
    - **PUT:** Update a resource.
    - **DELETE:** Delete a resource.

- **Request and Response:**
    - **Request:** The client's message to the server, including the HTTP method, headers, and optional data.
    - **Response:** The server's reply to the client's request, containing status information and, optionally, data.
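
As a small illustration of the request side, the sketch below builds a request object without sending it, so the method, endpoint, and headers can be inspected. It uses the standard-library `urllib.request.Request` rather than `requests` so that it runs with no extra dependency. The `Accept` header is an assumption added for the example.

```python
from urllib.request import Request

# Build (but do not send) a GET request for one SWAPI resource.
req = Request(
    "https://swapi.dev/api/people/1/",
    method="GET",
    headers={"Accept": "application/json"},
)

print(req.get_method())    # the HTTP method
print(req.full_url)        # the endpoint URI identifying the resource
print(req.header_items())  # the request headers as (name, value) pairs
```

Nothing goes over the network here. Sending the request would require passing `req` to `urllib.request.urlopen`.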


### Objective

- **Step 1: Introduction:** Find some information about the SWAPI API: the base URL, the rate limiting, and how to authenticate. Find information on all available resources within this API with a request.

- **Step 2: Retrieve Character Information:** Retrieve all character information (name, gender, height, ...).

- **Step 3: Retrieve Film Information:** Retrieve all film information (title, director, release date, ...).

- **Step 4: Retrieve Planet Information:** Retrieve all planet information (name, population, climate, ...).

- **Step 5: Search and Display:** Create a function to search for and display information about a specific character based on its name. Be sure to handle bad queries and to write at least three unit tests with descriptive names.

- **Step 6: Advanced Query:** Store in a pandas DataFrame all information about all the characters of the film of your choice. Group the characters by species at the end.

- **Step 7: Data Analysis:** Create an advanced query to retrieve information on all the films, and find a way to rank them according to the number of characters in each film.

- **Step 8 (bonus): Additional Endpoint:** Explore an additional endpoint and make a request to display relevant information, for example to retrieve starship or vehicle information.
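
One possible approach to the ranking in Step 7, sketched under assumptions: each film record returned by SWAPI carries a `characters` list of URLs (as visible in the film output of this notebook), so its length can serve as the sort key. The helper `rank_films_by_cast` is hypothetical, and the records below are made-up placeholders rather than real API data.

```python
def rank_films_by_cast(films):
    """Sort film records by the size of their 'characters' list, largest first."""
    return sorted(films, key=lambda film: len(film["characters"]), reverse=True)


# Placeholder records shaped like SWAPI film objects; titles and URLs are made up.
sample_films = [
    {"title": "Film A", "characters": ["people/1/", "people/2/"]},
    {"title": "Film B", "characters": ["people/1/", "people/2/", "people/3/"]},
    {"title": "Film C", "characters": ["people/4/"]},
]

for film in rank_films_by_cast(sample_films):
    print(film["title"], len(film["characters"]))
```

With real data, the list passed in would be the `results` array fetched from the films endpoint.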


### Documentation link :

- https://swapi.dev/documentation

### **Step 1: Introduction**

- The **base URL** for SWAPI is https://swapi.dev/api/.

- SWAPI does not impose strict **rate limits** for simple usage. However, excessive requests may result in temporary blocks.

- SWAPI is open and does not require an API key or **authentication**.

- Make a **GET** request to the base URL to see all **available resources**.

In [9]:
url = "https://swapi.dev/api/"
params = {
}

response = requests.get(url, params=params)
data = response.json()
data

{'people': 'https://swapi.dev/api/people/',
 'planets': 'https://swapi.dev/api/planets/',
 'films': 'https://swapi.dev/api/films/',
 'species': 'https://swapi.dev/api/species/',
 'vehicles': 'https://swapi.dev/api/vehicles/',
 'starships': 'https://swapi.dev/api/starships/'}

### **Step 2: Retrieve Character Information**


In [None]:
url = "https://swapi.dev/api/people/"
params = {
}

characters = []
while url:
    response = requests.get(url, params=params)
    data = response.json()
    characters.extend(data['results'])
    url = data['next']

for char in characters[:5]:  # Print only the first 5 records to keep the output short
    print(char)


{'name': 'Luke Skywalker', 'height': '172', 'mass': '77', 'hair_color': 'blond', 'skin_color': 'fair', 'eye_color': 'blue', 'birth_year': '19BBY', 'gender': 'male', 'homeworld': 'https://swapi.dev/api/planets/1/', 'films': ['https://swapi.dev/api/films/1/', 'https://swapi.dev/api/films/2/', 'https://swapi.dev/api/films/3/', 'https://swapi.dev/api/films/6/'], 'species': [], 'vehicles': ['https://swapi.dev/api/vehicles/14/', 'https://swapi.dev/api/vehicles/30/'], 'starships': ['https://swapi.dev/api/starships/12/', 'https://swapi.dev/api/starships/22/'], 'created': '2014-12-09T13:50:51.644000Z', 'edited': '2014-12-20T21:17:56.891000Z', 'url': 'https://swapi.dev/api/people/1/'}
{'name': 'C-3PO', 'height': '167', 'mass': '75', 'hair_color': 'n/a', 'skin_color': 'gold', 'eye_color': 'yellow', 'birth_year': '112BBY', 'gender': 'n/a', 'homeworld': 'https://swapi.dev/api/planets/1/', 'films': ['https://swapi.dev/api/films/1/', 'https://swapi.dev/api/films/2/', 'https://swapi.dev/api/films/3/',

### **Step 3: Retrieve All Film Information**

In [None]:
url = "https://swapi.dev/api/films/"
params = {
}

response = requests.get(url, params=params)
films = response.json()['results']

for film in films[:5]:  # Print only the first 5 records to keep the output short
    print(film)


{'title': 'A New Hope', 'episode_id': 4, 'opening_crawl': "It is a period of civil war.\r\nRebel spaceships, striking\r\nfrom a hidden base, have won\r\ntheir first victory against\r\nthe evil Galactic Empire.\r\n\r\nDuring the battle, Rebel\r\nspies managed to steal secret\r\nplans to the Empire's\r\nultimate weapon, the DEATH\r\nSTAR, an armored space\r\nstation with enough power\r\nto destroy an entire planet.\r\n\r\nPursued by the Empire's\r\nsinister agents, Princess\r\nLeia races home aboard her\r\nstarship, custodian of the\r\nstolen plans that can save her\r\npeople and restore\r\nfreedom to the galaxy....", 'director': 'George Lucas', 'producer': 'Gary Kurtz, Rick McCallum', 'release_date': '1977-05-25', 'characters': ['https://swapi.dev/api/people/1/', 'https://swapi.dev/api/people/2/', 'https://swapi.dev/api/people/3/', 'https://swapi.dev/api/people/4/', 'https://swapi.dev/api/people/5/', 'https://swapi.dev/api/people/6/', 'https://swapi.dev/api/people/7/', 'https://swapi.de

### **Step 4: Retrieve All Planet Information**


In [None]:
url = "https://swapi.dev/api/planets/"
params = {
}

planets = []
while url:
    response = requests.get(url, params=params)
    data = response.json()
    planets.extend(data['results'])
    url = data['next']

for planet in planets[:5]:  # Print only the first 5 records to keep the output short
    print(planet)


{'name': 'Tatooine', 'rotation_period': '23', 'orbital_period': '304', 'diameter': '10465', 'climate': 'arid', 'gravity': '1 standard', 'terrain': 'desert', 'surface_water': '1', 'population': '200000', 'residents': ['https://swapi.dev/api/people/1/', 'https://swapi.dev/api/people/2/', 'https://swapi.dev/api/people/4/', 'https://swapi.dev/api/people/6/', 'https://swapi.dev/api/people/7/', 'https://swapi.dev/api/people/8/', 'https://swapi.dev/api/people/9/', 'https://swapi.dev/api/people/11/', 'https://swapi.dev/api/people/43/', 'https://swapi.dev/api/people/62/'], 'films': ['https://swapi.dev/api/films/1/', 'https://swapi.dev/api/films/3/', 'https://swapi.dev/api/films/4/', 'https://swapi.dev/api/films/5/', 'https://swapi.dev/api/films/6/'], 'created': '2014-12-09T13:50:49.641000Z', 'edited': '2014-12-20T20:58:18.411000Z', 'url': 'https://swapi.dev/api/planets/1/'}
{'name': 'Alderaan', 'rotation_period': '24', 'orbital_period': '364', 'diameter': '12500', 'climate': 'temperate', 'gravi
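
The pagination loop used for characters and planets can be factored into one reusable function. The sketch below is an illustration under assumptions: `collect_pages` is a hypothetical helper, and the fetching function is injected so that the loop can be exercised offline with fake pages instead of the live API.

```python
def collect_pages(first_url, get_json):
    """Follow 'next' links and gather every item from a paginated listing.

    get_json is any callable that takes a URL and returns the decoded JSON page.
    """
    items = []
    url = first_url
    while url:
        page = get_json(url)
        items.extend(page["results"])
        # The last page has no 'next' link, which ends the loop.
        url = page.get("next")
    return items


# Offline stand-in for the API: two fake pages keyed by URL (placeholder data).
fake_pages = {
    "page1": {"results": [{"name": "a"}, {"name": "b"}], "next": "page2"},
    "page2": {"results": [{"name": "c"}], "next": None},
}

print(collect_pages("page1", fake_pages.__getitem__))
```

Against the real API, `get_json` could be a small function that calls `requests.get(url, timeout=10)`, then `raise_for_status()` to surface bad response codes, and returns `response.json()`.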

### **Step 5: Search and Display Specific Character**


In [40]:
import requests

def search_character_by_name(name):
    """
    Searches for a Star Wars character by name using the SWAPI (Star Wars API).

    Args:
        name (str): The name of the character to search for (case-insensitive).

    Returns:
        dict: A dictionary containing the character's details if found.
        str: An error message if the character is not found or the query is invalid.
    """
    if not isinstance(name, str) or not name.strip():
        return "Invalid query. Name must be a non-empty string."

    url = "https://swapi.dev/api/people/"
    while url:
        try:
            response = requests.get(url)
            if response.status_code == 200:
                data = response.json()
                for char in data['results']:
                    if char['name'].lower() == name.lower():
                        return char
                url = data['next']  
            else:
                return f"API error: Status code {response.status_code}"
        except requests.RequestException as e:
            return f"API request failed: {e}"
    return "Character not found."

# Example search
character = search_character_by_name("Luke Skywalker")
if isinstance(character, dict):
    print("Character Details:")
    for key, value in character.items():
        print(f"{key}: {value}")
else:
    print(character)


Character Details:
name: Luke Skywalker
height: 172
mass: 77
hair_color: blond
skin_color: fair
eye_color: blue
birth_year: 19BBY
gender: male
homeworld: https://swapi.dev/api/planets/1/
films: ['https://swapi.dev/api/films/1/', 'https://swapi.dev/api/films/2/', 'https://swapi.dev/api/films/3/', 'https://swapi.dev/api/films/6/']
species: []
vehicles: ['https://swapi.dev/api/vehicles/14/', 'https://swapi.dev/api/vehicles/30/']
starships: ['https://swapi.dev/api/starships/12/', 'https://swapi.dev/api/starships/22/']
created: 2014-12-09T13:50:51.644000Z
edited: 2014-12-20T21:17:56.891000Z
url: https://swapi.dev/api/people/1/


In [42]:
import unittest

def run_tests():
    class TestSearchCharacterByName(unittest.TestCase):

        def test_valid_character(self):
            """Test that a valid character name returns correct details."""
            character = search_character_by_name("Luke Skywalker")
            self.assertIsInstance(character, dict)
            self.assertEqual(character['name'], "Luke Skywalker")

        def test_nonexistent_character(self):
            """Test that a nonexistent character name returns 'Character not found.'"""
            character = search_character_by_name("Unknown Character")
            self.assertEqual(character, "Character not found.")

        def test_invalid_query(self):
            """Test that invalid inputs are handled correctly."""
            character = search_character_by_name("")
            self.assertEqual(character, "Invalid query. Name must be a non-empty string.")

            character = search_character_by_name(None)
            self.assertEqual(character, "Invalid query. Name must be a non-empty string.")

            character = search_character_by_name(123)
            self.assertEqual(character, "Invalid query. Name must be a non-empty string.")

    # Run the tests
    unittest.TextTestRunner().run(unittest.TestLoader().loadTestsFromTestCase(TestSearchCharacterByName))


run_tests()

...
----------------------------------------------------------------------
Ran 3 tests in 3.216s

OK
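
The tests above call the live API, so they depend on the network. As a complementary sketch, the matching logic can be isolated in a pure function and tested offline. `find_by_name` and `TestFindByName` are hypothetical names introduced for this illustration; the records only contain the `name` field needed for the comparison.

```python
import unittest


def find_by_name(characters, name):
    """Return the first record whose 'name' matches, ignoring case and outer spaces."""
    wanted = name.strip().lower()
    for character in characters:
        if character["name"].lower() == wanted:
            return character
    return None


class TestFindByName(unittest.TestCase):
    RECORDS = [{"name": "Luke Skywalker"}, {"name": "C-3PO"}]

    def test_match_ignores_case(self):
        found = find_by_name(self.RECORDS, "luke skywalker")
        self.assertEqual(found["name"], "Luke Skywalker")

    def test_unknown_name_returns_none(self):
        self.assertIsNone(find_by_name(self.RECORDS, "Unknown Character"))


# Run the offline tests the same way as the cell above.
suite = unittest.TestLoader().loadTestsFromTestCase(TestFindByName)
result = unittest.TextTestRunner().run(suite)
```

Tests written this way run in milliseconds and do not fail when the API is slow or unreachable.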


### Postman a powerful tool for

--------------------------

# Exploring GraphQL APIs

Useful links:
- https://graphql.org/learn/queries/
- https://graphql-demo.mead.io/

Use this GraphQL API to make complex requests about the Star Wars universe:
- https://swapi-graphql.netlify.app/

The cell below contains a simple GraphQL query.

# Exploring Star Wars Data with GraphQL

### Introduction 

In this exercise, you will retrieve the previous results in another way, by consuming the GraphQL API of SWAPI.

### A few elements to remember about the GraphQL Protocol

GraphQL is a powerful query language for APIs that provides a more efficient and flexible alternative to traditional REST APIs. In this exercise, we will interact with the Star Wars API (SWAPI) using GraphQL to retrieve specific information about characters, films, and species from the Star Wars universe. Some important aspects are:

- **Single Endpoint:** GraphQL APIs typically have a single endpoint for all queries, making it more straightforward to manage and interact with.

- **Flexible Responses:** Clients receive exactly the data they request, reducing over-fetching of data common in traditional REST APIs.

- **Introspection:** GraphQL supports introspection, allowing clients to query the schema itself, making it self-documenting and aiding in development.

### Key Concepts in GraphQL:

- **GraphQL Schema:** GraphQL APIs have a schema that defines the types of data available and the relationships between them.

- **Queries:** In GraphQL, clients specify the exact data they need using queries, allowing for more efficient data retrieval.

- **Fields and Nested Structures:** Queries can include specific fields, and GraphQL supports nested structures to retrieve related data in a single request.


### Objective

- **Step 1: Introduction:** Understand the GraphQL query. You can use the playground for this: https://swapi-graphql.netlify.app/?query=%7B%0A%20%20allFilms%20%7B%0A%20%20%20%20edges%20%7B%0A%20%20%20%20%20%20node%20%7B%0A%20%20%20%20%20%20%20%20id%2C%0A%20%20%20%20%20%20%20%20title%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%7D%0A%20%20%7D%0A%7D

- **Step 2: Retrieve Films with Character Information:** Retrieve Films with Character Information in a single query.


### Documentation link :

- https://swapi.dev/documentation

In [None]:
import requests

url = "https://swapi-graphql.netlify.app/.netlify/functions/index"
body = """
query {
  allFilms {
    edges {
      node {
        title
      }
    }
  }
}
"""

# GraphQL queries are conventionally sent with POST and a JSON body
response = requests.post(url=url, json={"query": body})
print("response status code: ", response.status_code)
if response.status_code == 200:
    print("response : ", response.json())
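
For Step 2, the simple query above can be extended with a nested selection so that each film comes back with its characters in a single request. This is a sketch under assumptions: the field names `characterConnection` and `characters` are taken to match the schema shown in the playground and should be checked there, `films_to_cast` is a hypothetical helper, and `sample` is a hand-written response of the expected shape used to exercise the parsing offline.

```python
import json

# Nested query: each film together with the names of its characters.
# Field names (characterConnection, characters) are assumed from the playground schema.
FILMS_WITH_CHARACTERS = """
query {
  allFilms {
    edges {
      node {
        title
        characterConnection {
          characters {
            name
          }
        }
      }
    }
  }
}
"""


def films_to_cast(response_json):
    """Map each film title to the list of its character names."""
    cast = {}
    for edge in response_json["data"]["allFilms"]["edges"]:
        node = edge["node"]
        names = [c["name"] for c in node["characterConnection"]["characters"]]
        cast[node["title"]] = names
    return cast


# JSON request body to send to the GraphQL endpoint.
payload = json.dumps({"query": FILMS_WITH_CHARACTERS})

# Hand-written response with the expected shape, used to test the parser offline.
sample = {"data": {"allFilms": {"edges": [
    {"node": {"title": "A New Hope",
              "characterConnection": {"characters": [
                  {"name": "Luke Skywalker"}, {"name": "C-3PO"}]}}}
]}}}

print(films_to_cast(sample))
```

Compared with the REST version, which needs one request per character URL, the nested selection lets the server resolve the relationship in one round trip.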

---------------------------