Streamlit + Readme update: copy to cURL (#22)

Skyvern-AI · Mar 4, 2024 · 3bf5671
1 parent 0495552
Showing 6 changed files with 79 additions and 34 deletions.
diff --git a/README.md b/README.md
@@ -27,8 +27,35 @@
   <img src="images/geico_shu_recording_cropped.gif"/>
 </p>
 
-Want to see more examples of Skyvern in action? Click [here](#real-world-examples-of-skyvern)!
+Traditional approaches to browser automation required writing custom scripts for each website, often relying on DOM parsing and XPath-based interactions that would break whenever the website layout changed.
 
+Instead of relying only on code-defined XPath interactions, Skyvern adds computer vision and LLMs to the mix: it parses items in the viewport in real time, creates a plan for the interaction, and then interacts with them.
+
+This approach gives us a few advantages:
+
+1. Skyvern can operate on websites it’s never seen before, as it’s able to map visual elements to actions necessary to complete a workflow, without any customized code
+1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
+1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
+    1. If you wanted to get an auto insurance quote from Geico, the answer to a common question “Were you eligible to drive at 18?” could be inferred from the driver receiving their license at age 16
+    1. If you were doing competitor analysis, it’s understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)
+
+
+Want to see examples of Skyvern in action? Jump to [#real-world-examples-of-skyvern](#real-world-examples-of-skyvern)
+
+
+# How it works
+Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).
+
+<picture>
+  <source media="(prefers-color-scheme: dark)" srcset="images/skyvern-system-diagram-dark.png" />
+  <img src="images/skyvern-system-diagram-light.png" />
+</picture>
+
+<!-- TODO (suchintan): 
+Expand the diagram above to go deeper into how:
+1. We draw bounding boxes
+2. We parse the HTML + extract the image to generate an interactable element map
+-->
 
 # Quickstart
 This quickstart guide will walk you through getting Skyvern up and running on your local machine. 
@@ -72,20 +99,26 @@ pre-commit install
 
 ## Running your first automation
 
+### Executing tasks (UI)
+Once you have the UI running, you can start an automation by filling out the fields shown in the UI and clicking "Execute".
 
-# How it works
-Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major difference: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).
+<p align="center">
+  <img src="images/skyvern_visualizer_run_task.png"/>
+</p>
 
-<picture>
-  <source media="(prefers-color-scheme: dark)" srcset="images/skyvern-system-diagram-dark.png"/>
-  <img src="images/skyvern-system-diagram-light.png"/>
-</picture>
+### Executing tasks (cURL)
+
+```
+curl -X POST -H 'Content-Type: application/json' -H 'x-api-key: {Your local API key}' -d '{
+    "url": "https://www.geico.com",
+    "webhook_callback_url": "",
+    "navigation_goal": "Navigate through the website until you generate an auto insurance quote. Do not generate a home insurance quote. If this page contains an auto insurance quote, consider the goal achieved",
+    "data_extraction_goal": "Extract all quote information in JSON format including the premium amount, the timeframe for the quote.",
+    "navigation_payload": "{Your data here}",
+    "proxy_location": "NONE"
+}' http://0.0.0.0:8000/api/v1/tasks
+```
 
-<!-- > TODO (suchintan): 
-Expand the diagram above to go deeper into how:
-1. We draw bounding boxes
-2. We parse the HTML + extract the image to generate an interactable element map
--->
 
 # Real-world examples of Skyvern
 <!-- > TODO (suchintan):
@@ -123,18 +156,6 @@ More extensive documentation can be found on our [documentation website](https:/
 
 Our focus is bringing stability to browser-based workflows. We leverage LLMs to create an AI Agent capable of interacting with websites like you or I would — all via a simple API call.
 
-Traditional approaches required writing custom scripts for websites, often relying on DOM parsing and XPath-based interactions which would break whenever the website layouts changed.
-
-Skyvern operates like a human — increasing reliability by not relying on fragile scripts, instead relying on computer vision to parse items in the viewport and interact with them the way a human would.
-
-This approach gives us a few advantages:
-
-1. Skyvern can operate on websites it’s never seen before, as it’s able to map visual elements to actions necessary to complete a workflow, without any customized code
-1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
-1. Skyvern is able to circumvent or navigate through many bot detection methods as many of them rely on allowing people to access the websites
-1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
-    1. If you wanted to get an auto insurance quote from Geico, the answer to a common question “Were you eligible to drive at 18?” could be inferred from the driver receiving their license at age 16
-    1. If you were doing competitor analysis, it’s understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)
 
 
 # Feature Roadmap
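
For readers who would rather script the call, the cURL request added under "Executing tasks (cURL)" can be approximated in Python. This is a minimal sketch using only the standard library: the endpoint, header names, and body fields are copied from the README example, while `API_KEY` and the payload values are placeholders. The request is only built here, not sent, so it runs without a local Skyvern server.

```python
import json
import urllib.request

API_URL = "http://0.0.0.0:8000/api/v1/tasks"  # endpoint from the README cURL example
API_KEY = "YOUR_LOCAL_API_KEY"  # placeholder, replace with your local API key

# Same fields as the README example; the values are placeholders.
task = {
    "url": "https://www.geico.com",
    "webhook_callback_url": "",
    "navigation_goal": "Navigate through the website until you generate an auto insurance quote.",
    "data_extraction_goal": "Extract all quote information in JSON format.",
    "navigation_payload": "{Your data here}",
    "proxy_location": "NONE",
}

request = urllib.request.Request(
    API_URL,
    data=json.dumps(task).encode("utf-8"),
    headers={"Content-Type": "application/json", "x-api-key": API_KEY},
    method="POST",
)

# Uncomment to send the request to a running local instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response).get("task_id"))
```

The visualizer client in this commit reads `task_id` from the JSON response, which is why the commented-out lines look for that key.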

diff --git a/images/skyvern_visualizer_run_task.png b/images/skyvern_visualizer_run_task.png
diff --git a/pyproject.toml b/pyproject.toml
@@ -43,6 +43,8 @@ asyncache = "^0.3.1"
 orjson = "^3.9.10"
 structlog = "^23.2.0"
 plotly = "^5.18.0"
+clipboard = "^0.0.4"
+curlify = "^2.2.1"
 
 
 [tool.poetry.group.dev.dependencies]
@@ -66,6 +68,7 @@ notebook = "^7.0.6"
 freezegun = "^1.2.2"
 snoop = "^0.4.3"
 rich = {extras = ["jupyter"], version = "^13.7.0"}
+clipboard = "^0.0.4"
 
 
 [build-system]

diff --git a/streamlit_app/visualizer/api.py b/streamlit_app/visualizer/api.py
@@ -1,7 +1,9 @@
 import json
 from typing import Any
 
+import curlify
 import requests
+from requests import PreparedRequest
 
 from skyvern.forge.sdk.schemas.tasks import TaskRequest
 
@@ -11,19 +13,31 @@ def __init__(self, base_url: str, credentials: str):
         self.base_url = base_url
         self.credentials = credentials
 
-    def create_task(self, task_request_body: TaskRequest) -> str | None:
+    def generate_curl_params(self, task_request_body: TaskRequest) -> tuple[str, dict[str, Any], dict[str, str]]:
         url = f"{self.base_url}/tasks"
         payload = task_request_body.model_dump()
         headers = {
             "Content-Type": "application/json",
             "x-api-key": self.credentials,
         }
 
+        return url, payload, headers
+
+    def create_task(self, task_request_body: TaskRequest) -> str | None:
+        url, payload, headers = self.generate_curl_params(task_request_body)
+
         response = requests.post(url, headers=headers, data=json.dumps(payload))
         if "task_id" not in response.json():
             return None
         return response.json()["task_id"]
 
+    def copy_curl(self, task_request_body: TaskRequest) -> str:
+        url, payload, headers = self.generate_curl_params(task_request_body)
+
+        req = requests.Request("POST", url, headers=headers, data=json.dumps(payload, indent=4))
+
+        return curlify.to_curl(req.prepare())
+
     def get_task(self, task_id: str) -> dict[str, Any] | None:
         """Get a task by id."""
         url = f"{self.base_url}/internal/tasks/{task_id}"
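
To make the output of `copy_curl` easier to picture, here is a rough, dependency-free sketch of rendering a URL, payload, and headers as a cURL command. It is not how `curlify` works internally; `build_curl_command` is a hypothetical helper, and the URL, key, and payload values are placeholders taken from the README example.

```python
import json
import shlex


def build_curl_command(url: str, payload: dict, headers: dict[str, str]) -> str:
    """Render a JSON POST request as a copy-pasteable cURL command (hypothetical helper)."""
    parts = ["curl", "-X", "POST"]
    for name, value in headers.items():
        # shlex.quote keeps each header a single shell argument.
        parts += ["-H", shlex.quote(f"{name}: {value}")]
    # indent=4 mirrors the pretty-printed body that copy_curl passes to requests.
    parts += ["-d", shlex.quote(json.dumps(payload, indent=4))]
    parts.append(shlex.quote(url))
    return " ".join(parts)


example = build_curl_command(
    "http://0.0.0.0:8000/api/v1/tasks",
    {"url": "https://www.geico.com", "proxy_location": "NONE"},
    {"Content-Type": "application/json", "x-api-key": "YOUR_API_KEY"},
)
print(example)
```

Quoting each argument matters because the JSON body contains spaces, newlines, and double quotes that a shell would otherwise split or interpret.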

diff --git a/streamlit_app/visualizer/sample_data.py b/streamlit_app/visualizer/sample_data.py
@@ -1,16 +1,11 @@
-from pydantic import BaseModel
+from skyvern.forge.sdk.schemas.tasks import TaskRequest
 
 
-class SampleData(BaseModel):
+class SampleTaskRequest(TaskRequest):
     name: str
-    url: str
-    navigation_goal: str
-    data_extraction_goal: str
-    navigation_payload: dict
-    extracted_information_schema: dict
 
 
-geico_sample_data = SampleData(
+geico_sample_data = SampleTaskRequest(
     name="Geico",
     url="https://www.geico.com",
     navigation_goal="Navigate through the website until you generate an auto insurance quote. Do not generate a home insurance quote. If this page contains an auto insurance quote, consider the goal achieved",

diff --git a/streamlit_app/visualizer/streamlit.py b/streamlit_app/visualizer/streamlit.py
@@ -1,3 +1,4 @@
+import clipboard
 import pandas as pd
 import streamlit as st
 
@@ -104,15 +105,26 @@ def select_step(step: dict) -> None:
 st.markdown(f"### **{select_env} - {select_org}**")
 execute_tab, visualizer_tab = st.tabs(["Execute", "Visualizer"])
 
+
+def copy_curl_to_clipboard(task_request_body: TaskRequest) -> None:
+    clipboard.copy(client.copy_curl(task_request_body=task_request_body))
+
+
 with execute_tab:
     example_tabs = st.tabs([supported_example.name for supported_example in supported_examples])
 
     for i, example_tab in enumerate(example_tabs):
         with example_tab:
             create_column, explanation_column = st.columns([1, 2])
             with create_column:
+                run_task, copy_curl = st.columns([3, 1])
+                task_request_body = supported_examples[i]
+                copy_curl.button(
+                    "Copy cURL", on_click=copy_curl_to_clipboard, kwargs={"task_request_body": task_request_body}
+                )
                 with st.form("task_form"):
-                    st.markdown("## Run a task")
+                    run_task.markdown("## Run a task")
+
                     example = supported_examples[i]
                     # Create all the fields to create a TaskRequest object
                     st_url = st.text_input("URL*", value=example.url, key="url")
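
Because the "Copy cURL" button is registered inside a `for` loop, it is worth keeping in mind how Python binds loop variables in callbacks. The sketch below is Streamlit-free and uses hypothetical names (`describe`, `build_callbacks`); it contrasts a callback that looks the variable up when it is called with one that stores the value when it is created.

```python
from functools import partial


def describe(example_name: str) -> str:
    """Stand-in for the work a button callback would do (hypothetical)."""
    return f"curl for {example_name}"


def build_callbacks(example_names: list[str]):
    late_bound, pinned = [], []
    for example_name in example_names:
        # Looks up example_name when called, i.e. after the loop has finished.
        late_bound.append(lambda: describe(example_name))
        # Stores the current value alongside the callback.
        pinned.append(partial(describe, example_name))
    return late_bound, pinned


late_bound, pinned = build_callbacks(["Geico", "Other"])
print([callback() for callback in late_bound])  # ['curl for Other', 'curl for Other']
print([callback() for callback in pinned])      # ['curl for Geico', 'curl for Other']
```

With a single example tab the two forms behave the same; the difference only shows up once more than one example is listed.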