## Robotic Process Automation with Generative AI

<b>By:</b> <a href='https://www.evolveailabs.com/'>Evolve AI Labs</a><br>

This notebook illustrates how to use Generative AI models, specifically <a href='https://en.wikipedia.org/wiki/Large_language_model'>LLMs</a>, for <a href='https://en.wikipedia.org/wiki/Robotic_process_automation'>Robotic Process Automation</a> (RPA). <br> <br>
Robotic Process Automation automates business processes using software programs. These programs range from simple logical constructs expressed as graphs to complicated Artificial Intelligence models. One of the strengths of LLMs is translating content from one form to another, and this can be leveraged for RPA tasks such as web scraping, as shown in the code below. <br> <br>

### Advantages
Generative AI can augment the benefits of RPA by further
- <b>improving productivity:</b> build RPA workflows faster
- <b>reducing complexity:</b> reduce dependence on SMEs
- <b>reducing costs:</b> simplify workflows, which reduces overhead

This notebook shows how to scrape a website using Generative AI. 

## Libraries
- <a href='https://www.selenium.dev/'>Selenium</a>: The first step in any web scraping exercise is extracting the page content. JavaScript-rendered pages are harder to scrape than static pages because their content is built in the browser after the initial HTML loads, so a plain HTTP request may not return it. Selenium is a popular browser automation tool that can render such pages for us.
- <a href='https://www.langchain.com/'>LangChain</a>: Once the page source is available, we use the LangChain API to send the HTML to an LLM. LangChain is a popular, easy-to-use Python framework for building LLM applications.

In [1]:
# Standard library
import time

# Browser automation
import chromedriver_autoinstaller
from selenium import webdriver
from selenium.webdriver import Chrome

# LLM client
from langchain.chat_models import ChatOpenAI

## LLM 
We will use the <a href='https://platform.openai.com/docs/models/gpt-3-5'>GPT-3.5</a> Turbo LLM provided by <a href='https://openai.com/'>OpenAI</a> to translate the HTML page source into the output we require. You'll need an OpenAI API key to run this notebook.

In [2]:
OPENAI_KEY = "<<OPENAI KEY>>"  # placeholder: replace with your own OpenAI API key
MODEL_NAME = "gpt-3.5-turbo-16k"
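
Pasting the key into the notebook makes it easy to leak when the notebook is shared. As a minimal sketch of an alternative (not part of the original notebook), the key could be read from an environment variable. The helper name `load_openai_key` is hypothetical, and `OPENAI_API_KEY` is assumed here as the variable name.

```python
import os


def load_openai_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the environment variable `env_var`.

    Hypothetical helper: raises RuntimeError with a readable message
    when the variable is unset or contains only whitespace.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running this notebook."
        )
    return key
```

With this in place, the cell above could assign `OPENAI_KEY = load_openai_key()` instead of a literal string.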

## Scraper
The function below extracts the page source of a JavaScript-rendered website, sends the HTML to the GPT-3.5 LLM, and asks it to extract information into a list for downstream consumption. Selenium renders the page and returns its HTML, and the LangChain API sends the page source to the LLM and returns the output.

In [3]:
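def gpt_scraper(url, instructions):
    """Render `url` in headless Chrome and ask the LLM to apply `instructions` to its HTML."""
    ts = 10  # seconds to wait for the page's JavaScript to finish rendering
    chromedriver_autoinstaller.install()  # install a matching chromedriver if needed
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    # "none" makes driver.get() return immediately; the sleep below gives the page time to render
    options.page_load_strategy = "none"
    driver = Chrome(options=options)
    try:
        driver.implicitly_wait(ts)
        driver.get(url)
        time.sleep(ts)
        page_source = driver.page_source
    finally:
        driver.quit()  # always close the browser, even if loading fails

    # Build the prompt from adjacent string literals so no stray indentation ends up in it
    prompt = (
        "You are an agent helping scrape data from an HTML page. "
        "Using the following HTML, " + instructions + " : " + page_source
    )
    chat = ChatOpenAI(temperature=0,
                      model=MODEL_NAME,
                      openai_api_key=OPENAI_KEY)
    response = chat.predict(prompt)
    return response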
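
The function sends the entire page source to the model, so a large page may not fit in the model's context window (the `16k` in the model name suggests a limit of roughly 16,000 tokens). The following is a minimal sketch, not part of the original notebook, of one way to shrink the HTML first: drop `<script>`, `<style>`, and `<noscript>` blocks and comments, then collapse whitespace. The names `shrink_html` and `_VisibleMarkup` are hypothetical, and only the standard library is used.

```python
import re
from html.parser import HTMLParser


class _VisibleMarkup(HTMLParser):
    """Re-emit HTML while dropping script/style/noscript blocks, comments, and the doctype."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []      # pieces of markup we decide to keep
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            # Keep the tag exactly as written, attributes included
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>: keep them once, without a closing tag
        if tag not in self.SKIP and self.skip_depth == 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def shrink_html(html):
    """Return `html` without scripts, styles, or comments, with whitespace collapsed."""
    parser = _VisibleMarkup()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()
```

Applying `shrink_html` to the page source before it is appended to the prompt keeps the links and visible text while removing content the model does not need. Character references in text are decoded along the way, which is acceptable for a prompt but means the result is not guaranteed to be byte-identical HTML.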

## Demo
The Raspberry Pi website lists many technical documents that we want to scrape. Instead of writing code to fetch the page HTML, locate the lists, and find the href links, we can just pass the URL and instructions to the function above and get the results.
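
For comparison, the hand-written route mentioned above might look like the sketch below. It is not taken from the original notebook, and it assumes the documents appear as plain `<a href="...">` links ending in `.pdf`, which is an assumption about the page structure. `PdfLinkCollector` and `extract_pdf_links` are hypothetical names.

```python
from html.parser import HTMLParser


class PdfLinkCollector(HTMLParser):
    """Collect href values of <a> tags that point at PDF files."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # Attribute names arrive lowercased; values keep their original case
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            self.links.append(href)


def extract_pdf_links(html):
    """Return the PDF links found in `html`, in document order."""
    collector = PdfLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

This version is fast and deterministic, but it has to be rewritten whenever the extraction rule changes. The LLM approach trades that maintenance for a natural-language instruction.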

In [4]:
%%time
url = "https://datasheets.raspberrypi.com/?_gl=1*1itvbwk*_ga*MTA0NjY5MTcyOS4xNzAxNjczMTkz*_ga_22FD70LWDS*MTcwMTY3MzE5My4xLjAuMTcwMTY3MzE5My4wLjAuMA.."
instructions = "provide the links to all documents in the page as a list."
response = gpt_scraper(url, instructions)

CPU times: user 486 ms, sys: 154 ms, total: 640 ms
Wall time: 2min 2s


In [5]:
print(response)

- audio/codec-zero-phat-product-brief.pdf
- audio/dac-plus-hat-product-brief.pdf
- audio/dac-pro-hat-product-brief.pdf
- audio/digiamp-plus-hat-product-brief.pdf
- bcm2711/bcm2711-peripherals.pdf
- bcm2835/bcm2835-peripherals.pdf
- bcm2836/bcm2836-peripherals.pdf
- build-hat/build-hat-python-library.pdf
- build-hat/build-hat-serial-protocol.pdf
- build-hat/getting-started-build-hat.pdf
- build-hat/raspberry-pi-build-hat-power-supply-product-brief.pdf
- build-hat/raspberry-pi-build-hat-product-brief.pdf
- camera/camera-extended-spectral-sensitivity.pdf
- camera/camera-module-2-mechanical-drawing.pdf
- camera/camera-module-2-schematics.pdf
- camera/camera-module-3-product-brief.pdf
- camera/camera-module-3-schematics.pdf
- camera/camera-module-3-standard-mechanical-drawing.pdf
- camera/camera-module-3-wide-mechanical-drawing.pdf
- camera/picamera2-manual.pdf
- camera/raspberry-pi-camera-guide.pdf
- camera/raspberry-pi-image-signal-processor-specification.pdf
- case-fan/case-fan-product-b
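
The response is a plain-text bullet list of relative paths, and its last entry appears to be cut off. For downstream consumption it may help to turn the text into a Python list of absolute URLs. The sketch below is an illustration, not part of the original notebook: `parse_link_list` is a hypothetical helper, and the `.pdf` suffix filter is an assumption used to skip entries that look incomplete.

```python
from urllib.parse import urljoin


def parse_link_list(response, base_url, required_suffix=".pdf"):
    """Turn a '- path' bullet list into absolute URLs.

    Lines that are not bullets are ignored. Entries that do not end with
    `required_suffix` (for example, a truncated final line) are skipped;
    pass an empty string to keep every bullet.
    """
    links = []
    for line in response.splitlines():
        item = line.strip()
        if not item.startswith("- "):
            continue
        path = item[2:].strip()
        if required_suffix and not path.lower().endswith(required_suffix):
            continue
        # Resolve the relative path against the page it was scraped from
        links.append(urljoin(base_url, path))
    return links
```

Calling `parse_link_list(response, url)` with the variables from the cells above would produce a list ready for a download loop. Because the model's output format is not guaranteed, a parser like this should be treated as best effort.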

## Conclusion
Using Generative AI, we were able to augment an RPA staple like web scraping with minimal knowledge of the DOM or JavaScript. This is how Generative AI and LLMs can help augment the benefits of Robotic Process Automation.