# Examples Making API Calls and Scraping


Applied Examples on AESO Market Data

**Create an API key at https://developer-apim.aeso.ca/apis**

## Built-in SQL Function for API Calls

Steps:
1. Create a connection and register it in Unity Catalog (done once, outside any UDF)
2. Make the API call with the built-in `http_request()` function

In [0]:
-- Register an HTTP connection to the AESO API in Unity Catalog
CREATE OR REPLACE CONNECTION aeso_api_connect
  TYPE HTTP
  OPTIONS (
    host 'https://apimgw.aeso.ca',
    port '443',
    base_path '/public',
    bearer_token ''
  );

In [0]:
-- Make the API call; replace YOUR-KEY (the key could be stored in a secret scope instead)
-- You could also wrap this in a SQL UDF to format the output
SELECT http_request(
  conn => 'aeso_api_connect',
  method => 'GET',
  path => '/currentsupplydemand-api/v2/csd/summary/current',
  json => '',
  headers => map(
    'Cache-Control', 'no-cache',
    'API-KEY', 'YOUR-KEY'
    )
);
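
Per the Databricks documentation, `http_request()` returns a struct with a `status_code` and a `text` field. The sketch below shows one way to check the status and decode the JSON body once those two values are available in Python. The helper name `parse_response` and the sample payload are hypothetical; the real AESO response has its own field names.

```python
import json


def parse_response(status_code: int, text: str) -> dict:
    """Return the decoded JSON body, or raise if the call did not succeed."""
    if status_code != 200:
        # Include a short excerpt of the body to help with debugging
        raise RuntimeError(f"API call failed with status {status_code}: {text[:200]}")
    return json.loads(text)


# Hypothetical payload, used only to illustrate the parsing step
sample_text = '{"example_field": 123}'
print(parse_response(200, sample_text))
```

Raising on a non-200 status keeps an error page or an authentication failure from being treated as market data further downstream.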

## Python UDF to Make API Calls

Because the built-in `http_request()` SQL function can make this call directly, prefer it where possible. Create a Python UDF only when you need a specific library or more complex logic.

Steps:
1. Define a function that calls the endpoint with Python's `urllib.request` and returns the response body as a JSON string

In [0]:
-- Alternatively, declare RETURNS TABLE and invoke the function as SELECT * FROM call_api(...)
CREATE OR REPLACE FUNCTION users.david_hurley.call_api(endpoint STRING, apikey STRING)
RETURNS STRING
LANGUAGE PYTHON
ENVIRONMENT (
  dependencies = '["urllib3==2.5.0", "requests==2.32.5"]',
  environment_version = 'None'
)
AS
$$
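import urllib.request

# Request headers, including the AESO API key
hdr = {
    'Cache-Control': 'no-cache',
    'API-KEY': apikey,
}

# Build an explicit GET request and send it
req = urllib.request.Request(endpoint, headers=hdr, method='GET')
with urllib.request.urlopen(req) as response:
    # read() returns bytes; decode so the UDF returns a STRING
    return response.read().decode('utf-8')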
$$;

In [0]:
-- Make the API call; replace YOUR-KEY (the key could be stored in a secret scope instead)
SELECT users.david_hurley.call_api("https://apimgw.aeso.ca/public/currentsupplydemand-api/v2/csd/summary/current", "YOUR-KEY")
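
To try the request logic outside the UDF, a plain Python version along these lines may help. It is a minimal sketch: the helper names `build_request` and `call_api` and the 10-second timeout are assumptions, not part of the notebook. Only the request construction runs here; `call_api` needs network access and a valid key.

```python
import urllib.request


def build_request(endpoint: str, apikey: str) -> urllib.request.Request:
    """Build a GET request carrying the same headers as the UDF."""
    headers = {"Cache-Control": "no-cache", "API-KEY": apikey}
    return urllib.request.Request(endpoint, headers=headers, method="GET")


def call_api(endpoint: str, apikey: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    req = build_request(endpoint, apikey)
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Inspect the request without sending it
req = build_request(
    "https://apimgw.aeso.ca/public/currentsupplydemand-api/v2/csd/summary/current",
    "YOUR-KEY",
)
print(req.get_method(), req.full_url)
```

Separating request construction from sending makes it possible to check the headers and URL without touching the network.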

## Python UDF to Scrape HTML

Steps:
1. Define a function that uses the `requests` and BeautifulSoup libraries to scrape an AESO page and return the raw HTML wrapped in a JSON string

In [0]:
-- Alternatively, declare RETURNS TABLE and invoke the function as SELECT * FROM scrape_url(...)
CREATE OR REPLACE FUNCTION users.david_hurley.scrape_url(url STRING)
RETURNS STRING
LANGUAGE PYTHON
ENVIRONMENT (
  dependencies = '["beautifulsoup4==4.14.2", "requests==2.32.5"]',
  environment_version = 'None'
)
AS
$$
import requests, json
from bs4 import BeautifulSoup

page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")

output = {
    "url": url,
    "raw_html": str(soup)
}

return json.dumps(output)
$$;
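
The function above returns the page's raw HTML, so any values still have to be pulled out of it afterwards. As a rough illustration of that follow-up step, the sketch below collects table cell text row by row. It uses the standard library's `html.parser` as a dependency-free stand-in for BeautifulSoup, and the class name, helper name, and sample HTML are all hypothetical rather than taken from the AESO report.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row, self._cell = [], None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row, self._cell = None, None


def extract_rows(html: str) -> list:
    """Return the table rows found in html as lists of cell strings."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up HTML fragment for illustration only
sample_html = (
    "<table><tr><th>Name</th><th>Value</th></tr>"
    "<tr><td>Example A</td><td>10</td></tr></table>"
)
print(extract_rows(sample_html))
```

This simple parser does not handle nested tables; for a page with more complex markup, BeautifulSoup's own search methods are likely the better fit.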

In [0]:
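-- Variant of scrape_url that returns XML instead of JSON
-- CREATE OR REPLACE overwrites the JSON version defined above
CREATE OR REPLACE FUNCTION users.david_hurley.scrape_url(url STRING)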
RETURNS STRING
LANGUAGE PYTHON
ENVIRONMENT (
  dependencies = '["beautifulsoup4==4.14.2", "requests==2.32.5"]',
  environment_version = 'None'
)
AS
$$
import requests
from bs4 import BeautifulSoup
from xml.etree.ElementTree import Element, SubElement, tostring

page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")

root = Element("scrape")
url_elem = SubElement(root, "url")
url_elem.text = url

html_elem = SubElement(root, "raw_html")
html_elem.text = str(soup)

xml_string = tostring(root, encoding='unicode')

return xml_string
$$;
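
One detail of the XML variant is that `ElementTree` escapes the HTML when it is assigned as element text, so the markup appears as `&lt;p&gt;...` inside `raw_html` and is restored when the XML is parsed again. The sketch below demonstrates that round trip; the helper names, the placeholder URL, and the sample HTML are hypothetical.

```python
from xml.etree.ElementTree import Element, SubElement, fromstring, tostring


def wrap_as_xml(url: str, html: str) -> str:
    """Build the same <scrape> document as the UDF from a URL and HTML text."""
    root = Element("scrape")
    SubElement(root, "url").text = url
    SubElement(root, "raw_html").text = html  # escaped on serialization
    return tostring(root, encoding="unicode")


def unwrap_html(xml_string: str) -> str:
    """Parse the XML and return the original, unescaped HTML."""
    return fromstring(xml_string).findtext("raw_html")


# Placeholder inputs for illustration only
sample_html = "<p>Supply &amp; demand</p>"
xml_string = wrap_as_xml("http://example.invalid/page", sample_html)

print(xml_string)               # HTML appears escaped inside <raw_html>
print(unwrap_html(xml_string))  # original HTML is recovered
```

A consumer of the XML output therefore needs to parse it with an XML library, rather than reading the `raw_html` text directly from the string, to get usable HTML back.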


In [0]:
SELECT users.david_hurley.scrape_url("http://ets.aeso.ca/ets_web/ip/Market/Reports/CSDReportServlet")