# Building the FastAPI Backend using LangServe

Let's first review what we have done so far in order to deploy our Smart Bot:

1) **Notebook 12**: Instructions on how to deploy a Backend API using the Azure Bot Service
2) **Notebook 13**: Instructions on how to talk to the Bot Service programmatically using POST requests

These are the pros and cons of using the Bot Service:

**Pros**:
- Easy to connect to multiple channels, including O365 email, MS Teams, web chat plugins, etc.
- The Bot Framework Python SDK gives us many utilities, such as typing indicators, proactive messages, cards, file upload, etc.
- Provides authentication and logging mechanisms with little work on our side
- It has SDKs for Python, JavaScript and .NET
- Includes easy integration with the Application Insights service for app monitoring
- As with other Microsoft services, the Microsoft product and support teams are behind it


**Cons**:
- It doesn't support streaming (yet)
- It doesn't support private endpoints
- It has a steep learning curve if you want to use all its capabilities


So, as an alternative, in this notebook we are going to build another Backend API in the same Azure App Service, this time using FastAPI with LangServe.<br>
From the [LangServe documentation](https://python.langchain.com/docs/langserve):

    LangServe helps developers deploy LangChain runnables and chains as a REST API.

    This library is integrated with FastAPI and uses pydantic for data validation.

    In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

## The main file: server.py

Just as the main code of the Bot Service API resides in `bot.py`, the main code of this FastAPI backend resides in `apps/backend/langserve/app/server.py`.

**Take a look at it!**

In `server.py` you will see that we created 4 endpoints:

- `/docs/` 
  - This endpoint shows the OpenAPI definition (Swagger) of the API
- `/chatgpt/`
  - This endpoint uses a simple LLM to answer with no system prompt
- `/joke/`
  - This endpoint uses a chain with an LLM + prompt + a custom structured JSON output (it adds the server timestamp)
- `/agent/`
  - This is the endpoint for our Smart GPT Bot brain agent

Every endpoint exposes these routes: `/invoke/`, `/batch/`, `/stream/` and `/stream_events/`
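
As a rough illustration of how endpoints and routes combine into URLs, the sketch below builds the route URLs for a single endpoint. The helper name `route_urls` and the localhost base URL are assumptions made for this example; they are not part of `server.py`.

```python
# Hypothetical helper (not part of server.py): compose the route URLs of one endpoint.
ROUTES = ("invoke", "batch", "stream", "stream_events")

def route_urls(base_url: str, endpoint: str) -> dict:
    """Return a {route: url} mapping for one endpoint of the API."""
    # Normalize slashes so "/agent/", "agent" and "/agent" all behave the same
    root = f"{base_url.rstrip('/')}/{endpoint.strip('/')}"
    return {route: f"{root}/{route}" for route in ROUTES}

urls = route_urls("http://localhost:8000", "/agent/")
for route, url in urls.items():
    print(f"{route:>13}: {url}")
```

Keeping the route names in one tuple makes it easy to loop over all of them when smoke-testing a new deployment.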

## Deploy to Azure App Service

In `apps/backend/langserve/README.md` you will find all the instructions on how to zip the code and upload it to the Azure Web App. We will be using the same Azure Web App Service created for the Bot Service API.

=> **GO AHEAD NOW AND FOLLOW THE INSTRUCTIONS in `apps/backend/langserve/README.md`**

## (Optional) Deploy the server locally

1) Go to the file `apps/backend/langserve/app/server.py` and uncomment the following code to test locally:
```python
    ### uncomment this section to run server in local host #########

    # from pathlib import Path
    # from dotenv import load_dotenv
    # # Resolve the repository root, four directories above the folder containing this script
    # library_path = Path(__file__).resolve().parents[4]
    # sys.path.append(str(library_path))
    # load_dotenv(str(library_path) + "/credentials.env")
    # os.environ["AZURE_OPENAI_MODEL_NAME"] = os.environ["GPT35_DEPLOYMENT_NAME"]

    ###################################
```
2) Open a terminal, activate the right conda environment, then go to this folder `apps/backend/langserve/app` and run this command:
    
```bash
python server.py
```

Alternatively, you can go to this folder `apps/backend/langserve/` and run this command:
```bash
langchain serve
```

This will run the backend API server on localhost, port 8000.

3) If you are working on an Azure ML compute instance, you can access the OpenAPI (Swagger) definition at this address:

    https://\<your_compute_name\>-8000.\<your_region\>.instances.azureml.ms/
    
    For example:
    https://pabmar1-8000.australiaeast.instances.azureml.ms/
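
The address pattern above can be assembled from the compute name, the region, and the port. The following is a minimal sketch; the function name `azureml_instance_url` is hypothetical and only mirrors the pattern shown in this section.

```python
def azureml_instance_url(compute_name: str, region: str, port: int = 8000) -> str:
    """Build the URL of a port exposed by an Azure ML compute instance (pattern from this section)."""
    return f"https://{compute_name}-{port}.{region}.instances.azureml.ms/"

# Reproduces the example address given above
print(azureml_instance_url("pabmar1", "australiaeast"))
```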

## Talk to the API using POST requests

In [2]:
import requests
import json
import sys
import time
import random

### Functions to post to the API and read its responses (streaming is supported)

In [3]:
def process_line(line):
    """Process a single line from the stream."""
    # print("line:",line)
    if line.startswith('data: '):
        # Extract JSON data following 'data: '
        json_data = line[len('data: '):]
        try:
            data = json.loads(json_data)
            if "event" in data:
                handle_event(data)
            elif "content" in data:
                # If there is immediate content to print
                print(data["content"], end="", flush=True)
            elif "steps" in data:
                print(data["steps"])
            elif "output" in data:
                print(data["output"])
        except json.JSONDecodeError as e:
            print(f"JSON decoding error: {e}")
    elif line.startswith('event: '):
        pass
    elif ": ping" in line:
        pass
    else:
        print(line)

def handle_event(event):
    """Handles specific events, adjusting output based on event type."""
    kind = event["event"]
    if kind == "on_chain_start" and event["name"] == "AgentExecutor":
        print(f"Starting agent: {event['name']}")
    elif kind == "on_chain_end" and event["name"] == "AgentExecutor":
        print("\n--")
        print(f"Done agent: {event['name']}")
    elif kind == "on_chat_model_stream":
        content = event["data"]["chunk"]["content"]
        if content:  # Ensure content is not None or empty
            print(content, end="", flush=True)
    elif kind == "on_tool_start":
        # Assuming event['data'].get('input') is a dictionary
        tool_inputs = event['data'].get('input')
        if isinstance(tool_inputs, dict):
            # Join the dictionary values into a comma-separated string of quoted values
            inputs_str = ", ".join(f"'{v}'" for v in tool_inputs.values())
        else:
            # Fallback if it's not a dictionary or in an unexpected format
            inputs_str = str(tool_inputs)
        print(f"Starting tool: {event['name']} with input: {inputs_str}")
    elif kind == "on_tool_end":
        print(f"Done tool: {event['name']}\n--")

    
def consume_api(url, payload):
    """Uses requests POST to talk to the FastAPI backend, supports streaming"""
    
    headers = {'Content-Type': 'application/json'}
    
    with requests.post(url, json=payload, headers=headers, stream=True) as response:
        try:
            response.raise_for_status()  # Raises an HTTPError for 4xx/5xx responses
            
            for line in response.iter_lines():
                if line:  # Check if the line is not empty
                    decoded_line = line.decode('utf-8')
                    process_line(decoded_line)
                    
                    
        except requests.exceptions.HTTPError as err:
            print(f"HTTP Error: {err}")
        except Exception as e:
            print(f"An error occurred: {e}")
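
The streaming routes send Server-Sent Events, which is why the functions above look for `data: `, `event: ` and `: ping` prefixes. To make that framing easier to see without a running server, here is a small offline sketch that pairs each `data:` payload with the preceding `event:` name. The function name `parse_sse` and the sample lines are made up for illustration and may not match exactly what the server emits.

```python
import json
from typing import Iterable, Iterator, Tuple

def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from decoded Server-Sent Events lines."""
    event_name = "message"  # SSE default when no 'event:' line precedes the data
    for line in lines:
        if line.startswith(":"):  # comment / keep-alive line such as ': ping'
            continue
        if line.startswith("event: "):
            event_name = line[len("event: "):].strip()
        elif line.startswith("data: "):
            try:
                payload = json.loads(line[len("data: "):])
            except json.JSONDecodeError:
                continue  # skip payloads that are not valid JSON
            yield event_name, payload
            event_name = "message"  # reset for the next event

# Made-up sample stream, for illustration only
sample = [
    "event: data",
    'data: {"content": "Hello"}',
    ": ping",
    "event: data",
    'data: {"content": " world"}',
    "event: end",
]
text = "".join(p.get("content", "") for name, p in parse_sse(sample) if name == "data")
print(text)
```

Returning pairs instead of printing keeps the parsing separate from the display logic, which makes it easier to test.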


### Base URL

In [11]:
base_url = "https://<YOUR_BACKEND_WEBAPP_NAME>-staging.azurewebsites.net"  # Note that "-staging" is the Azure App Service slot where the LangServe API was deployed
# base_url = "http://localhost:8000" # If you deployed locally
# Deployment used to produce the outputs in this notebook; replace it with your own URL
base_url = "https://webapp-backend-botid-zf4fwhz3gdn64-slot1.azurewebsites.net"

### `/chatgpt/` endpoint

In [12]:
payload = {'input': 'explain long covid in just 2 short sentences'}  # Your POST request payload

In [13]:
# URL of the FastAPI Invoke endpoint
url = base_url + '/chatgpt/invoke'
consume_api(url, payload)

{"output":{"content":"Long COVID is a condition where people experience lingering symptoms for weeks or months after recovering from COVID-19. Symptoms can include fatigue, shortness of breath, and brain fog.","additional_kwargs":{},"response_metadata":{"token_usage":{"completion_tokens":35,"prompt_tokens":16,"total_tokens":51},"model_name":"gpt-35-turbo","system_fingerprint":"fp_2f57f81c11","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"finish_reason":"stop","logprobs":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}},"type":"ai","name":null,"id":"run-568a70a5-3ab4-4de2-baea-6f01bfbe6dd3-0","example":false},"

In [14]:
# URL of the FastAPI streaming endpoint
url = base_url + '/chatgpt/stream'
consume_api(url, payload)

Long COVID is a condition where individuals continue to experience symptoms of COVID-19 for weeks or even months after the initial infection. These symptoms can include fatigue, shortness of breath, and brain fog, among others.

### `/joke/` endpoint: chain with custom output

In [15]:
payload = {'input': {"topic": "highschool", "language":"english"}}

url = base_url + '/joke/invoke'

consume_api(url, payload)

{"output":{"content":"Why did the math book look so sad in high school?\n\nBecause it had too many problems!","info":{"timestamp":"2024-04-13T22:11:39.359008"}},"callback_events":[],"metadata":{"run_id":"fe50b1f2-94c7-4b83-b230-164ff22205fa"}}
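
The `/invoke` route returns a single JSON document, so the useful fields can be read with plain dictionary access. The sketch below parses the response body shown above (copied verbatim into a string) and extracts the joke and the server timestamp; the variable names are illustrative.

```python
import json
from datetime import datetime

# Response body copied from the /joke/invoke output above
raw = r'{"output":{"content":"Why did the math book look so sad in high school?\n\nBecause it had too many problems!","info":{"timestamp":"2024-04-13T22:11:39.359008"}},"callback_events":[],"metadata":{"run_id":"fe50b1f2-94c7-4b83-b230-164ff22205fa"}}'

body = json.loads(raw)
joke = body["output"]["content"]
# The custom output adds the server timestamp in ISO 8601 format
server_time = datetime.fromisoformat(body["output"]["info"]["timestamp"])

print(joke)
print(f"Generated at {server_time:%Y-%m-%d %H:%M:%S}, run_id={body['metadata']['run_id']}")
```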


In [16]:
# URL of the FastAPI streaming endpoint
url = base_url + '/joke/stream_events'

consume_api(url, payload)

Why did the high school student bring a ladder to class?
Because he wanted to reach new heights in his education!

### `/agent/` endpoint: our complex smart bot

In [17]:
random_session_id = "session" + str(random.randint(1, 1000))
random_user_id = "user" + str(random.randint(1, 1000))

config = {"configurable": {"session_id": random_session_id, "user_id": random_user_id}}
print(random_session_id, random_user_id)

session861 user614
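
The `session_id` and `user_id` values travel with every request in the `config` field. Because `random.randint(1, 1000)` can produce the same ID twice, one possible alternative is to derive the IDs from UUIDs. This is only a sketch; the helper name `new_session_config` is hypothetical and is not used elsewhere in this notebook.

```python
import uuid

def new_session_config(user_id=None) -> dict:
    """Build a config dict with collision-resistant IDs (hypothetical helper)."""
    return {
        "configurable": {
            "session_id": f"session-{uuid.uuid4().hex}",
            # Reuse a known user ID if one is given, otherwise generate one
            "user_id": user_id or f"user-{uuid.uuid4().hex[:8]}",
        }
    }

config_a = new_session_config(user_id="user614")
config_b = new_session_config(user_id="user614")
# Same user, two independent sessions
print(config_a["configurable"]["session_id"] != config_b["configurable"]["session_id"])
```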


In [18]:
payload = {'input': {"question": "Hi, I am Pablo, what is your name?"}, 'config': config}
 
url = base_url + '/agent/invoke'

consume_api(url, payload)

{"output":{"output":"Hello Pablo, I'm Jarvis. How can I assist you today?"},"callback_events":[],"metadata":{"run_id":"b2639f28-1518-4bb8-9890-c3541a755288"}}


In [19]:
payload = {'input': {"question": "docsearch, what is CLP?"}, 'config': config}
 
url = base_url + '/agent/stream_events'

consume_api(url, payload)

Starting agent: AgentExecutor
Starting tool: docsearch with input: 'CLP'
Done tool: docsearch
--
I have found several references related to "CLP" based on different contexts. Here are some of the references related to "CLP" that were found:

1. **Constraint Logic Programming (CLP):** This refers to a powerful extension of conventional logic programming that incorporates constraint languages and constraint solving methods into logic programming languages. It provides a framework for a logic programming language that is parametrized with respect to constraint language and a domain of computation, yielding soundness and completeness results for an operational semantics relying on a constraint solver for the employed constraint language<sup><a href="https://datasetsgptsmartsearch.blob.core.windows.net/arxivcs/pdf/0008/0008036v1.pdf?sv=2022-11-02&ss=b&srt=sco&sp=rl&se=2026-01-03T02:11:44Z&st=2024-01-02T18:11:44Z&spr=https&sig=ngrEqvqBVaxyuSYqgPVeF%2B9c0fXLs94v3ASgwg7LDBs%3D">source</a></sup

In [20]:
payload = {'input': {"question": "bing, give me the current salary of registerd nurse and of dental hygenist in texas"}, 'config': config}
 
url = base_url + '/agent/stream_events'

consume_api(url, payload)

Starting agent: AgentExecutor
Starting tool: bing with input: 'current salary of registered nurse in Texas'
Starting tool: bing with input: 'current salary of dental hygienist in Texas'
Done tool: bing
--
Done tool: bing
--
The average salary of a registered nurse (RN) in Texas is approximately $69,091 per year, with a range typically falling between $61,955 and $79,025. However, the specific salary can vary widely depending on the city, education, certifications, additional skills, and years of experience<sup><a href="https://www.salary.com/research/salary/listing/nurse-rn-salary/tx" target="_blank">[1]</a></sup>.

For dental hygienists in Texas, the average salary varies based on different sources. Here are some estimates:

- Indeed reports an average salary of $46.69 per hour, based on 2.8k salaries reported as of April 1, 2024<sup><a href="https://www.indeed.com/career/dental-hygienist/salaries/TX" target="_blank">[1]</a></sup>.
- Salary.com indicates an average salary of $81,346 a

In [25]:
payload = {'input': {"question": "docsearch, How Covid affects obese people? and elderly"}, 'config': config}
 
url = base_url + '/agent/stream_events'

consume_api(url, payload)

Starting agent: AgentExecutor
Starting tool: docsearch with input: 'How Covid affects obese people?'
Starting tool: docsearch with input: 'How Covid affects elderly'
Done tool: docsearch
--
Done tool: docsearch
--
Here are the key findings on how COVID-19 affects obese individuals and the elderly:

### Obesity and COVID-19
1. **Increased Risk of Severe COVID-19**: Obese patients have a significantly higher risk of progressing to severe COVID-19, with men who are obese having even higher odds of developing severe disease.
2. **Impact on Critical Care Units**: A large proportion of patients in critical care units were either overweight or obese, indicating a substantial impact of obesity in seriously ill COVID-19 patients.
3. **Mechanistic Framework**: Adipose tissue in individuals with obesity may contribute to more extensive viral spread, immune activation, and cytokine amplification, leading to a more severe outcome for obese individuals with COVID-19.
4. **Clinical Guidelines and Rec

In [26]:
payload = {'input': {"question": "sqlsearch, how many people were hospitalized in 2020?"}, 'config': config}
 
url = base_url + '/agent/stream_events'

consume_api(url, payload)

Starting agent: AgentExecutor
Starting tool: sqlsearch with input: 'SELECT SUM(hospitalized) FROM covid_data WHERE year = 2020'
Done tool: sqlsearch
--
In 2020, the total number of hospitalized cases due to COVID-19 was 68,436,666. This data reflects the significant impact of the pandemic on healthcare systems and the need for extensive medical care for those affected.

If you have any more questions or need further information, feel free to ask!
--
Done agent: AgentExecutor


In [27]:
payload = {'input': {"question": "thank you!"}, 'config': config}
 
url = base_url + '/agent/stream_events'

consume_api(url, payload)

Starting agent: AgentExecutor
You're welcome! If you have any more questions in the future, feel free to ask. Take care!
--
Done agent: AgentExecutor


## Now let's try all endpoints and routes using LangChain's RemoteRunnable client

All of this is also available in JavaScript/TypeScript; see the LangChain.js documentation.

In [28]:
from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable

chatgpt_chain = RemoteRunnable(base_url + "/chatgpt/")
joke_chain = RemoteRunnable(base_url + "/joke/")
agent_chain = RemoteRunnable(base_url + "/agent/")


In [29]:
joke_chain.invoke({"topic": "cars", "language":"english"})

{'content': 'Why did the car go to therapy?\nBecause it had too many issues with its transmission!',
 'info': {'timestamp': '2024-04-13T22:18:57.074376'}}

In [30]:
# or async
await joke_chain.ainvoke({"topic": "parrots", "language":"spanish"})

{'content': '¿Por qué los loros nunca están estresados? Porque siempre están "pico y alegría".',
 'info': {'timestamp': '2024-04-13T22:19:05.397972'}}

In [34]:
prompt = [
    SystemMessage(content='you are a helpful assistant that responds to the user question.'),
    HumanMessage(content='explain long covid')
]

# Supports astream
async for msg in chatgpt_chain.astream(prompt):
    print(msg["content"], end="", flush=True)

Long COVID, also known as post-acute sequelae of SARS-CoV-2 infection (PASC), refers to a range of symptoms that persist for weeks or months after the acute phase of a COVID-19 infection has resolved. These symptoms can include fatigue, shortness of breath, chest pain, joint pain, and brain fog, among others. Long COVID can affect individuals who had mild, moderate, or severe initial COVID-19 infections, and the exact cause of these persistent symptoms is still being studied. It's important for individuals experiencing long COVID symptoms to work with healthcare providers to manage and treat their ongoing symptoms.

In [38]:
async for event in agent_chain.astream_events({"question": "bing, give me the current salary of registerd nurse and of dental hygenist in texas"}, config=config, version="v1"):
    kind = event["event"]
    if kind == "on_chain_start":
        if (event["name"] == "AgentExecutor"):  
            print(f"Starting agent: {event['name']}")
    elif kind == "on_chain_end":
        if (event["name"] == "AgentExecutor"):
            print()
            print("--")
            print(f"Done agent: {event['name']}")
    elif kind == "on_chat_model_stream":
        content = event["data"]["chunk"]["content"]
        if content:
            print(content, end="", flush=True)
    elif kind == "on_tool_start":
        print("--")
        print(f"Starting tool: {event['name']} with inputs: {event['data'].get('input')}")
    elif kind == "on_tool_end":
        print(f"Done tool: {event['name']}")
        # print(f"Tool output was: {event['data'].get('output')}")
        print("--")

Starting agent: AgentExecutor
--
Starting tool: bing with inputs: {'query': 'average salary of registered nurse in Texas'}
Done tool: bing
--
--
Starting tool: bing with inputs: {'query': 'average salary of dental hygienist in Texas'}
Done tool: bing
--
I have provided the average salaries for registered nurses and dental hygienists in Texas based on various sources. If you need more specific information or have any other questions, feel free to ask!
--
Done agent: AgentExecutor
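
When the streamed answer is needed as a value rather than printed token by token, the `on_chat_model_stream` chunks can be accumulated into one string. The sketch below does this over a list of hand-written events shaped like the ones this notebook handles; the function name `collect_answer` and the sample events are assumptions for illustration.

```python
from typing import Iterable

def collect_answer(events: Iterable[dict]) -> str:
    """Concatenate the text of all 'on_chat_model_stream' chunks into one string."""
    parts = []
    for event in events:
        if event.get("event") != "on_chat_model_stream":
            continue  # ignore chain and tool events
        chunk = event.get("data", {}).get("chunk", {})
        # Chunks are read as dicts here; fall back to attribute access otherwise
        if isinstance(chunk, dict):
            content = chunk.get("content")
        else:
            content = getattr(chunk, "content", "")
        if content:
            parts.append(content)
    return "".join(parts)

# Hand-written sample events, for illustration only
sample_events = [
    {"event": "on_chain_start", "name": "AgentExecutor", "data": {}},
    {"event": "on_chat_model_stream", "name": "chat", "data": {"chunk": {"content": "You're "}}},
    {"event": "on_chat_model_stream", "name": "chat", "data": {"chunk": {"content": ""}}},
    {"event": "on_chat_model_stream", "name": "chat", "data": {"chunk": {"content": "welcome!"}}},
    {"event": "on_chain_end", "name": "AgentExecutor", "data": {}},
]
print(collect_answer(sample_events))
```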


In [37]:
async for event in agent_chain.astream_events({"question": "bing, give me the current salary of registerd nurse and of dental hygenist in texas"}, config=config, version="v1"):
    print(event)

{'event': 'on_chain_start', 'run_id': '5806b431-5095-4eb3-bdc5-5d948b59504b', 'name': '/agent', 'tags': [], 'metadata': {'session_id': 'session861', 'user_id': 'user614'}, 'data': {'input': {'question': 'bing, give me the current salary of registerd nurse and of dental hygenist in texas'}}}
{'event': 'on_chain_start', 'name': 'insert_history', 'run_id': 'db4045b9-d75b-466d-b14a-6002efec097e', 'tags': ['seq:step:1'], 'metadata': {'session_id': 'session861', 'user_id': 'user614'}, 'data': {}}
{'event': 'on_chain_stream', 'name': 'insert_history', 'run_id': 'db4045b9-d75b-466d-b14a-6002efec097e', 'tags': ['seq:step:1'], 'metadata': {'session_id': 'session861', 'user_id': 'user614'}, 'data': {'chunk': {'question': 'bing, give me the current salary of registerd nurse and of dental hygenist in texas'}}}
{'event': 'on_chain_start', 'name': 'RunnableParallel<history>', 'run_id': '6b62816c-0553-43af-a2e6-f3ce6ffee84e', 'tags': [], 'metadata': {'session_id': 'session861', 'user_id': 'user614'}, 