# How to use LangChain and Azure OpenAI with Python


LangChain is an open source framework for developing applications with large language models (LLMs).

This guide demonstrates how to set up and use the Azure OpenAI API with LangChain.

The example is tailored to answering questions about the fields of an HTML/DOM tree.

To parse the HTML, we first try the **lxml** library <a href="https://lxml.de/installation.html">(see the documentation here)</a>.
    

## Set Up
The following libraries must be installed to use LangChain with Azure OpenAI.

In [None]:
# INSTALLATION
%pip install openai
%pip install langchain
%pip install langchain-openai


# used to parse the html into the dom
%pip install lxml

## API Configuration and Deployed Model Setup

After installing the necessary libraries, the API must be configured. The cells below first check the installed versions and then load the API settings from a `config.json` file into the Python environment.


In [7]:
# This is only for troubleshooting purposes if there are issues with versions
%pip show openai
%pip show langchain
%pip show langchain-openai


Name: openai
Version: 1.60.0
Summary: The official Python library for the openai API
Home-page: 
Author: 
Author-email: OpenAI <support@openai.com>
License: 
Location: /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages
Requires: anyio, distro, httpx, jiter, pydantic, sniffio, tqdm, typing-extensions
Required-by: langchain-openai
Note: you may need to restart the kernel to use updated packages.
Name: langchain
Version: 0.3.15
Summary: Building applications with LLMs through composability
Home-page: https://github.com/langchain-ai/langchain
Author: 
Author-email: 
License: MIT
Location: /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages
Requires: aiohttp, langchain-core, langchain-text-splitters, langsmith, numpy, pydantic, PyYAML, requests, SQLAlchemy, tenacity
Required-by: 
Note: you may need to restart the kernel to use updated packages.
Name: langchain-openai
Version: 0.3.1
Summary: An integration package connecting OpenAI and
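
The `%pip show` output above is verbose and gets cut off. As an alternative for troubleshooting, the versions alone can be read with the standard library's `importlib.metadata`. This is a minimal sketch; the helper name `installed_versions` is illustrative and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Package names taken from the installation cell above.
packages = ["openai", "langchain", "langchain-openai", "lxml"]
for name, ver in installed_versions(packages).items():
    print(f"{name}: {ver or 'not installed'}")
```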

In [8]:
import openai
import json
import os
# Note: AzureChatOpenAI now lives in the langchain_openai package, not in langchain.chat_models
from langchain_openai import AzureChatOpenAI
from langchain_core.messages import HumanMessage


# Load config values
with open(r'config.json') as config_file:
    config_details = json.load(config_file)

# The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
openai_api_base=config_details['OPENAI_API_BASE']
    
# API version e.g. "2023-07-01-preview"
openai_api_version=config_details['OPENAI_API_VERSION']

# The name of your Azure OpenAI deployment chat model. e.g. "gpt-35-turbo-0613"
deployment_name=config_details['DEPLOYMENT_NAME']

# The API key for your Azure OpenAI resource.
openai_api_key = config_details['OPENAI_API_KEY']
# openai_api_key = os.getenv("OPENAI_API_KEY")

# This is set to `azure`
openai_api_type="azure"
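
The commented-out `os.getenv` line above suggests that the key can also come from the environment. A minimal sketch of that idea follows. It assumes the four key names from `config.json` are reused as environment variable names, which the notebook does not state, and `load_azure_settings` is a hypothetical helper.

```python
import json
import os
from pathlib import Path

# Key names as read from config.json in the cell above.
REQUIRED_KEYS = (
    "OPENAI_API_BASE",
    "OPENAI_API_VERSION",
    "DEPLOYMENT_NAME",
    "OPENAI_API_KEY",
)


def load_azure_settings(config_path="config.json"):
    """Hypothetical helper: prefer environment variables, fall back to the config file."""
    path = Path(config_path)
    file_values = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    # An environment variable wins over the file value when both are set.
    settings = {key: os.getenv(key) or file_values.get(key) for key in REQUIRED_KEYS}
    missing = [key for key, value in settings.items() if not value]
    if missing:
        raise KeyError(f"Missing Azure OpenAI settings: {', '.join(missing)}")
    return settings
```

With this in place, `settings = load_azure_settings()` would return a dictionary whose values can be passed to `AzureChatOpenAI` in the same way as the variables defined above.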

## Deployed Model Setup

In [9]:
# Create an instance of chat llm
llm = AzureChatOpenAI(
    azure_endpoint=openai_api_base,    
    openai_api_version=openai_api_version,
    deployment_name=deployment_name,    
    api_key=openai_api_key,
    openai_api_type=openai_api_type,
)

llm.invoke([HumanMessage(content="Write me a poem")])

AIMessage(content="In the hush of dawn's first light,  \nWhere shadows dance in soft delight,  \nThe world awakens, calm and still,  \nWith whispers of a day's goodwill.  \n\nThe sun climbs high, a golden crown,  \nOver fields and bustling towns,  \nTouching hearts and dreams anew,  \nPainting skies a vibrant blue.  \n\nA gentle breeze begins to play,  \nSweeping the worries of the day,  \nIts tender touch, a sweet caress,  \nBringing peace and tenderness.  \n\nAs twilight dons her purple gown,  \nStars sprinkle light on sleepy towns,  \nA lullaby in silent night,  \nGuiding dreams in tranquil flight.  \n\nIn every moment, life unfolds,  \nA tapestry of stories told,  \nIn the heart of nature's rhyme,  \nWe find our place, our space in time.  ", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 169, 'prompt_tokens': 11, 'total_tokens': 180, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0,

## Load an HTML file

In [10]:
from pathlib import Path

from IPython.display import display, HTML

# Resolve the HTML file relative to the notebook's working directory
notebook_path = Path.cwd()

simple_form_subpath = "html-files/simple-form/index.html"

simple_form_path = notebook_path / simple_form_subpath

html_form_content = simple_form_path.read_text()

# This is only to display it
display(HTML(html_form_content))


## Load an HTML file without any form

In [12]:
from utils import load_file

html_no_forms_content = load_file("html-files/html-no-forms/index.html")

# This is only to display it
display(HTML(html_no_forms_content))
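
`load_file` comes from a local `utils` module that is not shown in this notebook. A plausible minimal version, assuming it simply mirrors the previous cell (resolve a path relative to the working directory and return the text), might look like the sketch below. The real helper may differ, and the `base_dir` parameter is an assumption added for illustration.

```python
from pathlib import Path


def load_file(relative_path, base_dir=None):
    """Return the text of a file located relative to base_dir (default: cwd).

    Sketch of the unshown utils.load_file; base_dir is a hypothetical parameter.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / relative_path).read_text(encoding="utf-8")
```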

## Given an HTML page, obtain its DOM

The following cell parses the input HTML into a DOM tree using lxml.

In [15]:
from lxml import etree
from lxml.html import fromstring, tostring



# 1. First attempt: etree.parse (https://lxml.de/tutorial.html#the-parse-function)
# tree = etree.parse(html_form_content)
# This does not work: parse() expects a file name or file-like object, not a string of HTML

# 2. Second attempt: lxml.html.fromstring (https://lxml.de/lxmlhtml.html)
page = fromstring(html_form_content)

print(page) # This is just the root element

print(tostring(page)) # This is the whole page serialized as bytes



for child in page:
    print(child.tag)


 
# TODO: at this point we should be able to write a utility function that returns a list of forms.


<Element html at 0x123497b60>
b'<html>\n  <head>\n    <title>Contact form</title>\n    <link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.4.1/css/all.css" integrity="sha384-5sAR7xN1Nv6T6+dT2mhtzEpVJvfS3NScPQTrOxhwjIuvcA67KV2R5Jz6kr4abQsz" crossorigin="anonymous">\n    <link href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700" rel="stylesheet">\n    <style>\n      html, body {\n      min-height: 100%;\n      padding: 0;\n      margin: 0;\n      font-family: Roboto, Arial, sans-serif;\n      font-size: 14px;\n      color: #666;\n      }\n      h1 {\n      margin: 0 0 20px;\n      font-weight: 400;\n      color: #1c87c9;\n      }\n      p {\n      margin: 0 0 5px;\n      }\n      .main-block {\n      display: flex;\n      flex-direction: column;\n      justify-content: center;\n      align-items: center;\n      min-height: 100vh;\n      background: #1c87c9;\n      }\n      form {\n      padding: 25px;\n      margin: 25px;\n      box-shadow: 0 2px 5px #f5
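
The TODO in the cell above asks for a utility that lists the forms. A minimal sketch with lxml follows, assuming the page is available as a string. The name `list_forms` and the layout of the returned dictionaries are illustrative choices, not part of the original notebook.

```python
from lxml.html import fromstring


def list_forms(html_text):
    """Return one dict per <form> element with its attributes and named fields."""
    page = fromstring(html_text)
    forms = []
    # iter("form") visits the root element itself and all of its descendants.
    for form in page.iter("form"):
        fields = [
            field.get("name")
            for field in form.iter("input", "select", "textarea")
            if field.get("name")
        ]
        forms.append(
            {"action": form.get("action"), "method": form.get("method"), "fields": fields}
        )
    return forms


# Small made-up page used only to exercise the helper.
sample = """
<html><body>
  <form action="/contact" method="post">
    <input type="text" name="email">
    <textarea name="message"></textarea>
  </form>
</body></html>
"""
print(list_forms(sample))
```

In the notebook, `len(list_forms(html_form_content))` would give a deterministic form count that can be compared with the model's answers in the sections that follow.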

## Simple Prompt With an input HTML

Create a prompt that takes the HTML content as input.

This first version is deliberately simple: a single question and a basic prompt.

In [17]:
# The chat prompt classes live in langchain_core.prompts
from langchain_core.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

# Example 1 copied from https://predictivehacks.com/get-started-with-langchain-prompt-templates/

system_template = "You are a front-end developer expert that knows about css, javascript and html. You are presented with an input HTML and the user will ask questions about it"
system_message_prompt = SystemMessagePromptTemplate.from_template(system_template)


human_template = "I will give you an entire HTML page content as string. I would like to know how many forms are on it. The content is {content}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)


chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

request = chat_prompt.format_prompt(content=html_form_content).to_messages()

result = llm.invoke(request)

print(result.content)
print(f'in_tk={result.usage_metadata["input_tokens"]} ou_tk={result.usage_metadata["output_tokens"]} t_tk={result.usage_metadata["total_tokens"]}')

The HTML content you provided contains **one** form element. The form is defined by the `<form>` tag and this tag appears only once in the HTML content.
in_tk=930 ou_tk=33 t_tk=963
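
The token-usage `print` statement is repeated in several cells. A small helper could keep the format in one place. This is a sketch: `format_usage` is a hypothetical name, and it assumes `usage_metadata` behaves like a dictionary with the three keys used above.

```python
def format_usage(usage_metadata):
    """Format input, output and total token counts on one line."""
    return (
        f'in_tk={usage_metadata["input_tokens"]} '
        f'ou_tk={usage_metadata["output_tokens"]} '
        f't_tk={usage_metadata["total_tokens"]}'
    )


# Values taken from the output above.
print(format_usage({"input_tokens": 930, "output_tokens": 33, "total_tokens": 963}))
```

In the notebook this would be called as `print(format_usage(result.usage_metadata))`.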


## Prompt With an input HTML delimited by triple backticks

Create a prompt that takes the HTML content as input.

This time the content is clearly delimited with triple backticks, the system message is shorter, and the model is asked to return only an integer.

In [21]:
# The chat prompt classes live in langchain_core.prompts
from langchain_core.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

# Example 1 copied from https://predictivehacks.com/get-started-with-langchain-prompt-templates/

system_template = "You are a front-end developer expert that knows about css, javascript and html."
system_message_prompt = SystemMessagePromptTemplate.from_template(system_template)

human_message_template = """
The text delimited by triple backticks represents an HTML DOM page. \
Return me only an integer that represents how many forms the page has. \
Answer 0 if the page has no forms. Do not return anything else. \
```{content}```
"""
human_message_prompt = HumanMessagePromptTemplate.from_template(human_message_template)

chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

request = chat_prompt.format_prompt(content=html_form_content).to_messages()

result = llm.invoke(request)

print('Example with forms')
print(result.content)
print(f'in_tk={result.usage_metadata["input_tokens"]} ou_tk={result.usage_metadata["output_tokens"]} t_tk={result.usage_metadata["total_tokens"]}')

print('Example with no forms')
request = chat_prompt.format_prompt(content=html_no_forms_content).to_messages()

result = llm.invoke(request)
print(result.content)
print(f'in_tk={result.usage_metadata["input_tokens"]} ou_tk={result.usage_metadata["output_tokens"]} t_tk={result.usage_metadata["total_tokens"]}')


Example with forms
1
in_tk=934 ou_tk=1 t_tk=935
Example with no forms
0
in_tk=7393 ou_tk=1 t_tk=7394
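
Even with the instruction to return only an integer, the model's reply is still a string and may occasionally carry extra whitespace or other text. A defensive parsing step can catch that before the value is used. The sketch below is one possible approach, and the helper name `parse_form_count` is hypothetical.

```python
import re


def parse_form_count(reply):
    """Convert the model's reply to a non-negative int, or raise ValueError."""
    # Drop surrounding whitespace and any stray backticks around the number.
    cleaned = reply.strip().strip("`").strip()
    if not re.fullmatch(r"\d+", cleaned):
        raise ValueError(f"Expected a bare integer, got: {reply!r}")
    return int(cleaned)


print(parse_form_count("1"))
print(parse_form_count(" 0\n"))
```

In the notebook this would be called as `parse_form_count(result.content)`. A `ValueError` signals that the prompt did not constrain the answer as intended.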
