# Building the FastAPI Backend using LangServe

Let's first review what we have done so far in order to deploy our Smart Bot:

1) **Notebook 12**: Instructions on how to deploy a Backend API using the Azure Bot Service
2) **Notebook 13**: Instructions on how to interface/talk to the Bot Service programmatically using POST requests

These are the pros and cons of using the Bot Service:

**Pros**:
- Easy to connect to multiple channels, including O365 email, MS Teams, web chat plugins, etc.
- The Bot Framework Python SDKs give us many utilities, such as typing indicators, proactive messages, cards, file upload, etc.
- Provides authentication and logging mechanisms with little work on our side
- As with other Microsoft services, you get the Microsoft product and support teams behind it

**Cons**:
- It doesn't support streaming (yet)
- Has a steeper learning curve if you want to use all of its capabilities


So, as an alternative, in this Notebook we are going to build another Backend API, this time using FastAPI with LangServe.
From the [LANGSERVE DOCUMENTATION](https://python.langchain.com/docs/langserve):

    LangServe helps developers deploy LangChain runnables and chains as a REST API.

    This library is integrated with FastAPI and uses pydantic for data validation.

    In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

## The main file: server.py

Just as the main code of the Bot Service API resides in `bot.py`, the main code of this FastAPI backend resides in `apps/backend/langserve/app/server.py`.

**Take a look at it!**

In `server.py` you will see that we created the following endpoints:

- `/chatgpt/`
  - This endpoint uses a simple LLM to answer with no system prompt
- `/joke/`
  - This endpoint uses a chain with a GPT-3.5 model + prompt + a custom JSON output (adds the timestamp of the server)
- `/agent/`
  - This is the endpoint for our Smart GPT Bot brain agent
  
For every endpoint all these routes are available: `/invoke/`, `/batch/`, `/stream/` and `/stream_events/`
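
The route is appended to the endpoint path, so `/joke/` is reachable as `/joke/invoke`, `/joke/stream`, and so on. The sketch below shows one way to assemble those URLs. `route_url` is a hypothetical helper (not part of LangServe), and the `{"inputs": [...]}` payload shape for `/batch` is an assumption taken from the LangServe documentation, not something shown in `server.py`.

```python
ROUTES = ("invoke", "batch", "stream", "stream_events")


def route_url(base_url: str, endpoint: str, route: str) -> str:
    """Join base URL, endpoint and route into one URL (hypothetical helper)."""
    if route not in ROUTES:
        raise ValueError(f"Unknown route: {route!r}")
    return f"{base_url.rstrip('/')}/{endpoint.strip('/')}/{route}"


# /invoke, /stream and /stream_events take a single input under the "input" key
single_payload = {"input": {"topic": "cars", "language": "english"}}

# Assumption: /batch takes a list of inputs under the "inputs" key
batch_payload = {
    "inputs": [
        {"topic": "cars", "language": "english"},
        {"topic": "parrots", "language": "spanish"},
    ]
}

print(route_url("http://localhost:8000", "/joke/", "invoke"))
```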

## Deploy to Azure App Service

In `apps/backend/langserve/README.md` you will find all the instructions on how to Zip the code and upload it to the Azure Web App. We will be using the same Azure Web App Service created for the Bot Service API.

=> GO AHEAD NOW AND FOLLOW THE INSTRUCTIONS in `apps/backend/langserve/README.md`

## (Optional) Deploy the server locally

1) Go to the file `apps/backend/langserve/app/server.py` and uncomment the following code to test locally:
```python
    ### uncomment this section to run server in local host #########

    # from pathlib import Path
    # from dotenv import load_dotenv
    # # Calculate the path three directories above the current script
    # library_path = Path(__file__).resolve().parents[4]
    # sys.path.append(str(library_path))
    # load_dotenv(str(library_path) + "/credentials.env")
    # os.environ["AZURE_OPENAI_MODEL_NAME"] = os.environ["GPT35_DEPLOYMENT_NAME"]

    ###################################
```
2) Open a terminal, activate the right conda environment, then go to this folder `apps/backend/langserve/app` and run this command:
    
```bash
python server.py
```

Alternatively, you can go to this folder `apps/backend/langserve/` and run this command:
```bash
langchain serve
```

This will run the backend server API on localhost, port 8000.

3) If you are working on an Azure ML compute instance, you can access the OpenAPI (Swagger) definition at this address:

    https://<your_compute_name>-8000.<your_region>.instances.azureml.ms/
    
    for example:
    https://pabmar1-8000.australiaeast.instances.azureml.ms/
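
If you switch between compute instances, the address pattern above can be filled in programmatically. This is only a small sketch: `azureml_proxy_url` is a hypothetical helper, and it assumes the URL pattern is exactly the one shown in the example.

```python
def azureml_proxy_url(compute_name: str, region: str, port: int = 8000) -> str:
    """Build the Azure ML compute instance address for a local port (hypothetical helper)."""
    return f"https://{compute_name}-{port}.{region}.instances.azureml.ms/"


print(azureml_proxy_url("pabmar1", "australiaeast"))
```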

## Talk to the API using POST requests

In [1]:
import requests
import json
import sys
import time
import random

### Functions to post to the API and read its responses, with streaming support

In [2]:
def process_line(line):
    """Process a single line from the stream."""
    # print("line:",line)
    if line.startswith('data: '):
        # Extract JSON data following 'data: '
        json_data = line[len('data: '):]
        try:
            data = json.loads(json_data)
            if "event" in data:
                handle_event(data)
            elif "content" in data:
                # If there is immediate content to print
                print(data["content"], end="", flush=True)
            elif "steps" in data:
                print(data["steps"])
            elif "output" in data:
                print(data["output"])
        except json.JSONDecodeError as e:
            print(f"JSON decoding error: {e}")
    elif line.startswith('event: '):
        pass
    elif ": ping" in line:
        pass
    else:
        print(line)

def handle_event(event):
    """Handles specific events, adjusting output based on event type."""
    kind = event["event"]
    if kind == "on_chain_start" and event["name"] == "AgentExecutor":
        print(f"Starting agent: {event['name']}")
    elif kind == "on_chain_end" and event["name"] == "AgentExecutor":
        print("\n--")
        print(f"Done agent: {event['name']}")
    elif kind == "on_chat_model_stream":
        content = event["data"]["chunk"]["content"]
        if content:  # Ensure content is not None or empty
            print(content, end="", flush=True)
    elif kind == "on_tool_start":
        print(f"Starting tool: {event['name']} with inputs: {event['data'].get('input')}")
    elif kind == "on_tool_end":
        print(f"Done tool: {event['name']}\n--")

    
def consume_api(url, payload):
    """Use requests.post to talk to the FastAPI backend; supports streaming."""

    headers = {'Content-Type': 'application/json'}

    with requests.post(url, json=payload, headers=headers, stream=True) as response:
        try:
            response.raise_for_status()  # Raises an HTTPError for 4xx/5xx responses
            
            for line in response.iter_lines():
                if line:  # Check if the line is not empty
                    decoded_line = line.decode('utf-8')
                    process_line(decoded_line)
                    
                    
        except requests.exceptions.HTTPError as err:
            print(f"HTTP Error: {err}")
        except Exception as e:
            print(f"An error occurred: {e}")
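
The streaming routes answer with server-sent events: JSON payloads arrive on lines starting with `data: `, next to `event: ` lines and `: ping` keep-alives, which is why `process_line` branches on those prefixes. As a rough sketch of the same parsing idea, the hypothetical helper below collects the streamed `content` fields into one string instead of printing them. The sample frames are made up for illustration and are simpler than what the server actually sends.

```python
import json


def collect_stream_text(lines):
    """Concatenate the "content" field of every SSE `data:` line."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip "event: ..." lines and ": ping" keep-alives
        try:
            data = json.loads(line[len("data: "):])
        except json.JSONDecodeError:
            continue  # ignore malformed frames in this sketch
        if isinstance(data, dict) and "content" in data:
            parts.append(data["content"])
    return "".join(parts)


# Illustrative frames only (assumption); real frames carry more fields
sample = [
    "event: data",
    'data: {"content": "Long "}',
    ": ping",
    "event: data",
    'data: {"content": "COVID"}',
    "event: end",
]
print(collect_stream_text(sample))  # → Long COVID
```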


### Base URL

In [80]:
base_url = "https://<YOUR_BACKEND_WEBAPP_NAME>.azurewebsites.net"
# base_url = "http://localhost:8000" # If you deployed locally

### `/chatgpt/` endpoint

In [81]:
payload = {'input': 'explain long covid'}  # Your POST request payload

In [82]:
# URL of the FastAPI Invoke endpoint
url = base_url + '/chatgpt/invoke'
consume_api(url, payload)

{"output":{"content":"Long COVID, also known as post-acute sequelae of SARS-CoV-2 infection (PASC), refers to a range of symptoms that persist for weeks or months after the acute phase of a COVID-19 infection. These symptoms can affect multiple systems in the body, including the respiratory, cardiovascular, neurological, and psychological systems. Common symptoms of long COVID include fatigue, shortness of breath, chest pain, joint pain, and cognitive difficulties.\n\nThe exact cause of long COVID is not fully understood, but it is believed to result from a combination of factors, including the lingering effects of the initial viral infection, immune system dysregulation, and potential damage to organs and tissues. Long COVID can occur in individuals who had mild, moderate, or severe COVID-19 infections, and it can affect people of all ages.\n\nThe impact of long COVID can be debilitating and significantly affect a person's quality of life, ability to work, and overall well-being. Trea

In [84]:
# URL of the FastAPI streaming endpoint
url = base_url + '/chatgpt/stream'
consume_api(url, payload)

Long COVID, also known as post-acute sequelae of SARS-CoV-2 infection (PASC), refers to a range of symptoms that persist for weeks or months after the acute phase of a COVID-19 infection has resolved. These symptoms can include fatigue, shortness of breath, chest pain, joint pain, and brain fog, among others. Long COVID can affect people who had mild or severe initial COVID-19 symptoms, and it can significantly impact their quality of life and ability to carry out daily activities.

The exact cause of long COVID is not fully understood, but it is believed to be related to the body's immune response to the virus, as well as potential long-term damage to organs and tissues caused by the infection. There is currently no specific treatment for long COVID, and management typically involves addressing individual symptoms and providing support for patients to cope with the ongoing effects of the virus.

Research into long COVID is ongoing, and healthcare professionals are working to better un

### `/joke` endpoint : chain with custom output

In [85]:
payload = {'input': {"topic": "highschool", "language":"english"}}

url = base_url + '/joke/invoke'

consume_api(url, payload)

{"output":{"content":"Why don't high schoolers make good comedians?\n\nBecause they always have too much homework and never have time to work on their stand-up routine!","info":{"timestamp":"2024-04-04T04:55:44.590095"}},"callback_events":[],"metadata":{"run_id":"02aca420-98b4-4e09-a907-66424e90c5e4"}}
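
Because `/invoke` returns a single JSON document, the custom output can also be read field by field instead of printed raw. The sketch below parses the response body shown above. `parse_invoke_response` is a hypothetical helper, and it assumes the `output.content` / `output.info.timestamp` layout of this particular `/joke` chain.

```python
import json
from datetime import datetime

# Body copied from the /joke/invoke response above
raw = r"""{"output":{"content":"Why don't high schoolers make good comedians?\n\nBecause they always have too much homework and never have time to work on their stand-up routine!","info":{"timestamp":"2024-04-04T04:55:44.590095"}},"callback_events":[],"metadata":{"run_id":"02aca420-98b4-4e09-a907-66424e90c5e4"}}"""


def parse_invoke_response(text):
    """Return (joke, server timestamp, run id) from a /joke/invoke body."""
    body = json.loads(text)
    output = body["output"]
    stamp = datetime.fromisoformat(output["info"]["timestamp"])
    return output["content"], stamp, body["metadata"]["run_id"]


joke, stamp, run_id = parse_invoke_response(raw)
print(joke)
print(stamp.date(), run_id)
```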


In [86]:
# URL of the FastAPI streaming endpoint
url = base_url + '/joke/stream_events'

consume_api(url, payload)

Why did the math book look so sad in high school?
Because it had too many problems.

### `/agent` endpoint : our complex smart bot

In [87]:
random_session_id = "session"+ str(random.randint(1, 1000))
random_user_id = "user"+ str(random.randint(1, 1000))

config={"configurable": {"session_id": random_session_id, "user_id": random_user_id}}
print(random_session_id, random_user_id)

session693 user539
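
The `session_id` and `user_id` under `configurable` travel with every request to `/agent`. With only 1,000 possible values each, two notebook runs can easily end up with the same ids. As an alternative sketch (not what the notebook uses), ids can be derived from `uuid4`; `new_session_config` is a hypothetical helper and the `session-`/`user-` prefixes are arbitrary.

```python
import uuid


def new_session_config(user_id=None):
    """Build a config dict with collision-resistant ids (hypothetical helper)."""
    return {
        "configurable": {
            "session_id": f"session-{uuid.uuid4().hex}",
            # Keep a caller-supplied user id, otherwise generate one
            "user_id": user_id or f"user-{uuid.uuid4().hex}",
        }
    }


config_a = new_session_config()
config_b = new_session_config(user_id="user539")
print(config_a["configurable"]["session_id"])
```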


In [35]:
payload = {'input': {"question": "Hi, I am Pablo, what is your name?"}, 'config': config}
 
url = base_url + '/agent/invoke'

consume_api(url, payload)

{"output":{"output":"Hello Pablo, I'm Jarvis. How can I assist you today?"},"callback_events":[],"metadata":{"run_id":"247cda6a-6345-4f63-85fa-4ad67cacfd4a"}}


In [36]:
payload = {'input': {"question": "docsearch, what is CLP?"}, 'config': config}
 
url = base_url + '/agent/invoke'

consume_api(url, payload)

{"output":{"output":"I have found multiple meanings and applications for the term \"CLP.\" Here are some of the contexts in which \"CLP\" is mentioned:\n\n1. **Constraint Logic Programming (CLP):** CLP is a powerful extension of conventional logic programming that incorporates constraint languages and constraint solving methods into logic programming languages. It involves the parametrization of a logic programming language with respect to a constraint language and a domain of computation, yielding soundness and completeness results for an operational semantics relying on a constraint solver for the employed constraint language<sup><a href=\"https://datasetsgptsmartsearch.blob.core.windows.net/arxivcs/pdf/0008/0008036v1.pdf?sv=2022-11-02&ss=b&srt=sco&sp=rl&se=2026-01-03T02:11:44Z&st=2024-01-02T18:11:44Z&spr=https&sig=ngrEqvqBVaxyuSYqgPVeF%2B9c0fXLs94v3ASgwg7LDBs%3D\">source</a></sup>.\n\n2. **Constraint Logic Programming (CLP(FD)):** CLP(FD) is an extension of logic programming where l

In [38]:
payload = {'input': {"question": "bing, give me the current salary of registered nurse and of dental hygienist in texas"}, 'config': config}
 
url = base_url + '/agent/stream_events'

consume_api(url, payload)

Starting agent: AgentExecutor
Starting tool: bing with inputs: {'query': 'current salary of registered nurse in Texas'}
Starting tool: bing with inputs: {'query': 'current salary of dental hygienist in Texas'}
Done tool: bing
--
Done tool: bing
--
The average salary of a registered nurse (RN) in Texas is approximately $69,166 as of February 26, 2024. However, the salary range typically falls between $62,021 and $79,111, and it can vary widely depending on the city and other important factors such as education, certifications, additional skills, and years of experience. The starting RN salary in Texas is around $61,950, which is higher than the starting salary in many other states. The average salary for a registered nurse in Texas is reported to be $39.33 per hour, with an estimated total pay of $81,825 per year in the Texas area. The estimated average salary for a registered nurse in Texas is $79,516 after adjusting for the cost of living, and the state will need 258,720 new registere

In [53]:
payload = {'input': {"question": "docsearch, How Covid affects obese people and elderly"}, 'config': config}
 
url = base_url + '/agent/stream_events'

consume_api(url, payload)

Starting agent: AgentExecutor
Starting tool: docsearch with inputs: {'query': 'How Covid affects obese people and elderly'}
Done tool: docsearch
--
I found some valuable information on how COVID-19 affects obese individuals and the elderly:

### Impact on Obese Individuals:
- Studies have shown that obesity is highly frequent among critically ill patients with COVID-19, and it remains challenging to understand the mechanisms by which COVID-19 severity is increased in the context of obesity<sup><a href="https://doi.org/10.1002/oby.22867">[1]</a></sup>.
- Two-thirds of people who developed serious or fatal COVID-19-related complications were overweight or obese, according to a study by the UK Intensive Care National Audit and Research Centre<sup><a href="https://doi.org/10.1002/oby.22844">[2]</a></sup>.
- Obese patients have been found to have increased odds of progressing to severe COVID-19, and clinicians should pay close attention to obese patients, who should be carefully managed wit

In [42]:
payload = {'input': {"question": "sqlsearch, how many people were hospitalized in 2020?"}, 'config': config}
 
url = base_url + '/agent/stream_events'

consume_api(url, payload)

Starting agent: AgentExecutor
Starting tool: sqlsearch with inputs: {'query': 'SELECT SUM(hospitalized) AS total_hospitalized FROM health_data WHERE year = 2020'}
Done tool: sqlsearch
--
The total number of hospitalized cases in 2020 was 68,436,666. This data was obtained from the `covidtracking` table, and the SQL query used to calculate this value was:

```sql
SELECT SUM(hospitalized) AS total_hospitalized FROM covidtracking WHERE date LIKE '2020%'
```

If you have any more questions or need further assistance, feel free to ask!
--
Done agent: AgentExecutor


In [54]:
payload = {'input': {"question": "thank you!"}, 'config': config}
 
url = base_url + '/agent/stream_events'

consume_api(url, payload)

Starting agent: AgentExecutor
You're welcome, Pablo! If you have any more questions in the future or need further assistance, don't hesitate to reach out. Stay safe and take care!
--
Done agent: AgentExecutor


## Now let's try all endpoints and routes using LangServe's RemoteRunnable client

All of these are also available in TypeScript; see the LangServe documentation.

In [57]:
from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable

chatgpt_chain = RemoteRunnable(base_url + "/chatgpt/")
joke_chain = RemoteRunnable(base_url + "/joke/")
agent_chain = RemoteRunnable(base_url + "/agent/")


In [58]:
joke_chain.invoke({"topic": "cars", "language":"english"})

{'content': 'Why do cars never get lost? Because they always know the way to the nearest garage!',
 'info': {'timestamp': '2024-04-04T04:37:14.556296'}}

In [59]:
# or async
await joke_chain.ainvoke({"topic": "parrots", "language":"spanish"})

{'content': '¿Por qué los loros son malos contadores?\nPorque siempre están haciendo "cuatro" con las cuentas.',
 'info': {'timestamp': '2024-04-04T04:37:19.456951'}}

In [60]:
prompt = [
    SystemMessage(content='you are a helpful assistant that responds to the user question.'),
    HumanMessage(content='explain long covid')
]

# Supports astream
async for msg in chatgpt_chain.astream(prompt):
    print(msg.content, end="", flush=True)

Long COVID, also known as post-acute sequelae of SARS-CoV-2 infection (PASC), refers to a condition where individuals continue to experience symptoms and complications of COVID-19 for an extended period of time, typically lasting beyond 12 weeks after the initial infection. Common symptoms of long COVID can include fatigue, shortness of breath, chest pain, joint pain, and brain fog, among others. The exact cause and mechanism of long COVID are not fully understood, and it can affect individuals who had both severe and mild initial COVID-19 infections. Long COVID can have a significant impact on a person's quality of life and may require ongoing medical care and support.

In [61]:
async for event in agent_chain.astream_events({"question": "sqlsearch, how many people were hospitalized in 2020?"}, config=config, version="v1"):
    kind = event["event"]
    if kind == "on_chain_start":
        # Only report the top-level agent run, not every nested chain
        if event["name"] == "AgentExecutor":
            print(f"Starting agent: {event['name']}")
    elif kind == "on_chain_end":
        if event["name"] == "AgentExecutor":
            print()
            print("--")
            print(f"Done agent: {event['name']}")
    elif kind == "on_chat_model_stream":
        content = event["data"]["chunk"].content
        if content:
            # Empty content in the context of OpenAI means
            # that the model is asking for a tool to be invoked.
            # So we only print non-empty content
            print(content, end="", flush=True)
    elif kind == "on_tool_start":
        print("--")
        print(
            f"Starting tool: {event['name']} with inputs: {event['data'].get('input')}"
        )
    elif kind == "on_tool_end":
        print(f"Done tool: {event['name']}")
        # print(f"Tool output was: {event['data'].get('output')}")
        print("--")

Starting agent: AgentExecutor
--
Starting tool: sqlsearch with inputs: {'query': 'how many people were hospitalized in 2020?'}
Done tool: sqlsearch
--
The total number of people hospitalized in 2020 was 68,436,666. This data was obtained from the `covidtracking` table, and the SQL query used to calculate this value was:

```sql
SELECT SUM(hospitalized) AS total_hospitalized FROM covidtracking WHERE date LIKE '2020%'
```

If you have any more questions or need further assistance, feel free to ask!
--
Done agent: AgentExecutor
