In [None]:
using OpenAI
using JSON

Useful links:
- [Chat API Reference](https://platform.openai.com/docs/api-reference/chat)
- [Julia Package Documentation](https://juliaml.github.io/OpenAI.jl/dev/)

## Setting up the System

The API key authenticates you to the OpenAI API. If you have an OpenAI account, you can generate a key [here](https://platform.openai.com/account/api-keys).

In [None]:
api_key = "";

The `model` variable tells the system which specific model should produce the response. For the chat interface, `gpt-3.5-turbo` and `gpt-4` are available.

My account has access to GPT-4, so feel free to try it out.

Differences:
- `gpt-3.5-turbo`
  - Much faster
  - Cheap
- `gpt-4`
  - Much more capable
  - Much easier to prompt
  - Slower
  - ~30x more expensive

In [None]:
model = "gpt-3.5-turbo";

## Our first interaction with the System

Creating the messages to be sent to the system:

Messages are a `Vector` of `Dict`s that describes the whole chat history up to this point. Note: the system does not remember anything! You have to send the whole history of your conversation with every request.

Each entry in `messages` is a `Dict` of the form `"role" => string, "content" => string`, where `"role"` can be `"system"`, `"user"`, or `"assistant"` and `"content"` is a `String`.
- `"user"` and `"assistant"` tell the system what the user said and what the system's response was in previous steps of the chat.
- `"system"` can be the first entry in the messages and tells the system what it is and how it should behave. If it is left out, the system behaves like a normal chat assistant.

In [None]:
messages = [
    Dict("role" => "user", "content" => "When was Abraham Lincoln born?")
]

Now we can send our messages to the chat system, passing along our key and the model we want to use:

In [None]:
r = create_chat(api_key, model, messages)

Let's get the actual response of the assistant!

In [None]:
r.response["choices"][1]["message"]["content"]
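
To make the indexing above easier to follow, here is a minimal sketch of the nesting it walks through. The `mock_response` below is a hand-written `Dict` that only imitates the shape of a chat completion (an assumption for illustration, not actual API output), so it runs without an API key. Note that Julia indexes from 1, so `[1]` selects the first choice.

```julia
# Hand-written stand-in for `r.response` (illustrative only, not real API output).
mock_response = Dict(
    "choices" => [
        Dict(
            "index" => 0,
            "message" => Dict(
                "role" => "assistant",
                "content" => "Abraham Lincoln was born on February 12, 1809.",
            ),
            "finish_reason" => "stop",
        ),
    ],
)

# Same access path as used on the real response.
answer = mock_response["choices"][1]["message"]["content"]
println(answer)
```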

## We can take the response and continue the conversation

In [None]:
messages = [
    Dict("role" => "user", "content" => "When was Abraham Lincoln born?"),
    Dict("role" => "assistant", "content" => "Abraham Lincoln was born on February 12, 1809."),
    Dict("role" => "user", "content" => "When was he assassinated?"),
]
r = create_chat(api_key, model, messages)
r.response["choices"][1]["message"]["content"]
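
Because the whole history has to be resent with every request, it can help to build it up programmatically instead of retyping it. The following is a small sketch of that idea; `continue_chat` is a hypothetical helper name, not part of OpenAI.jl, and it makes no API call itself.

```julia
# Hypothetical helper: returns a new history with the assistant's reply
# and the next user message appended. The input history is left unchanged.
function continue_chat(history, assistant_reply, next_user_message)
    vcat(
        history,
        [
            Dict("role" => "assistant", "content" => assistant_reply),
            Dict("role" => "user", "content" => next_user_message),
        ],
    )
end

history = [Dict("role" => "user", "content" => "When was Abraham Lincoln born?")]
history = continue_chat(
    history,
    "Abraham Lincoln was born on February 12, 1809.",
    "When was he assassinated?",
)

# The roles now alternate user, assistant, user.
println([m["role"] for m in history])
```

The resulting `history` can be passed to `create_chat` in place of a hand-written `messages` vector.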

## What about the system prompt?

In [None]:
messages = [
    Dict("role" => "system", 
        "content" => "You are a chat system in opposite world named Marv that constantly lies and always gives absurd, hilariously wrong information"),
    Dict("role" => "user", 
        "content" => "Who was Abraham Lincoln?"),
]
r = create_chat(api_key, model, messages)
r.response["choices"][1]["message"]["content"]

## Make it easier to use for us

In [None]:
gen_message(role, message) = Dict("role" => role, "content" => message)

function get_chat_response(
        api_key::AbstractString, 
        model::AbstractString,
        system_prompt::AbstractString,
        prompt::AbstractString;
        kwargs...
        )::AbstractString
    
    messages = [
        gen_message("system", system_prompt),
        gen_message("user", prompt)
    ]
    
    r = create_chat(api_key, model, messages; kwargs...)
    
    first(r.response["choices"])["message"]["content"]
end

In [None]:
get_chat_response(
    api_key,
    model,
    "You are a very boring chat system that returns the number 42, no matter what the user inputs. Only return the number 42 as a response to every request!",
    "Hello, how are you doing?"
    )
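
The `kwargs...` in `get_chat_response` forwards any extra keyword arguments to `create_chat`. The sketch below shows only the forwarding mechanism, using a stand-in function (`fake_create_chat`, hypothetical) that returns the keywords it received, so it runs without an API key.

```julia
# Hypothetical stand-in for `create_chat`: returns the keyword arguments it was given.
fake_create_chat(api_key, model, messages; kwargs...) = Dict(kwargs)

# Mirrors how `get_chat_response` passes its keywords along unchanged.
function forward_kwargs(api_key, model, messages; kwargs...)
    fake_create_chat(api_key, model, messages; kwargs...)
end

received = forward_kwargs("key", "gpt-3.5-turbo", []; temperature = 0, max_tokens = 10)
println(received)
```

With the real function, this would correspond to a call such as `get_chat_response(api_key, model, system_prompt, prompt; temperature = 0)`, assuming OpenAI.jl passes extra keywords on as request parameters; check the package documentation for the names it accepts.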

## Prompt engineering

What can we do with this now? Whatever you can dream up! The hardest part is coming up with ideas and convincing the system to actually do what you want ...

### A translation system

In [None]:
function translate(api_key, text, language; model = "gpt-3.5-turbo")
    system_prompt = "You are now a very reliable translation system. Whatever text you are given, translate it into $language while 
    preserving the meaning and the structure of the text as much as possible. Don't translate words and IT jargon that are usually left untranslated. Only output the translated text!"
    translation = get_chat_response(api_key, model, system_prompt, text)
    translation
end

In [None]:
translate(api_key, "We are here at a lecture for the data science ULG at the university of Innsbruck", "German")

In [None]:
program = """
def my_sum(values, threshold):
    sum = values[0]
    for v in values[1:]:
        if v > threshold:
            sum += v
    return sum
"""
println(translate(api_key, program, "julia"))

## More useful stuff

In [None]:
function extract_first_json_object_regex(s::String)
    pattern = r"\{(?:[^{}]|(?R))*\}"
    m = match(pattern, s)

    if m === nothing
        error("No JSON object found in the input string.")
    end

    json_str = m.match
    return JSON.parse(json_str)
end
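
Models sometimes wrap the requested JSON in extra prose, which is why a plain `JSON.parse` on the full reply can fail. Here is a minimal sketch of what the recursive pattern above does with such a reply. The `reply` string is made up for illustration, and the pattern is repeated so the snippet runs on its own.

```julia
using JSON

# Made-up model reply that surrounds the JSON object with extra text.
reply = """Sure! Here is the result: {"names": ["Sebastian", "Marc", "Sue"]} Let me know if you need more."""

# Same pattern as in `extract_first_json_object_regex`: `(?R)` recurses into nested braces.
pattern = r"\{(?:[^{}]|(?R))*\}"
m = match(pattern, reply)

parsed = JSON.parse(m.match)
println(parsed["names"])
```

One limitation to keep in mind: the pattern only balances braces, so a `{` or `}` inside a JSON string value can confuse it.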

### Name Extractor

In [None]:
system_prompt = """
You are now a name extraction system. 
Given a text, extract all names of persons in the text and return them as a list in a JSON object under the key "names". 
ONLY output the JSON object and nothing else!

Example:
Input:
"This is Joe Miller speaking. Could I please be connected to Henrietta?"
Output:
{"names": ["Joe Miller", "Henrietta"]}
"""
res = get_chat_response(api_key, model, system_prompt,
    "Hello Sebastian, I am Marc and I am happy to meet you. I got your contact information from Sue.")

In [None]:
d = extract_first_json_object_regex(res)

In [None]:
d["names"]

## Exercises

### Exercise 1

Write a function that takes the URL of a web page and returns a summary of that page as bullet points. Display the bullet points nicely using [markdown](https://www.markdownguide.org/basic-syntax/) syntax.

You can use the `get_plain_text` function to extract the text of a webpage. Warning: The method is not super stable and might not work on all websites!

In [None]:
using Markdown
using Gumbo
using Cascadia
using AbstractTrees
import Gumbo.text

function my_text(cur_doc::HTMLDocument)
    string_parts = []

    for elem in PreOrderDFS(cur_doc.root) 
        isa(elem, HTMLText) || continue
        push!(string_parts, Gumbo.text(elem))
    end

    return join(string_parts, " ")
end

function get_plain_text(url::String)
    # Fetch the website content
    content = read(download(url), String)

    # Parse the HTML
    cur_doc = Gumbo.parsehtml(content)

    return my_text(cur_doc)
end

You can render markdown-formatted text nicely using `Markdown.parse`:

In [None]:
Markdown.parse("""
    - One
    - Two
    - Three
    """)

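If the bullet points are available as a vector of strings, they can be joined into markdown text before rendering. A minimal sketch, where `points` is a made-up example list:

```julia
using Markdown

# Made-up example data; in the exercise these would come from the model's reply.
points = ["First point", "Second point"]

# Prefix each entry with "- " and join the lines into one markdown string.
md_text = join("- " .* points, "\n")

rendered = Markdown.parse(md_text)
```
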
In [None]:
function summarize(api_key, model, url)
end

### Exercise 2

**Write a general text classification system**
- You should be able to supply a list of classes the system can choose from
- The system can select exactly one of the given classes
- Your function should return only the selected class as its only output

Use this function for:
- Sentiment classification of restaurant reviews
- Prioritization of customer emails

Either look for examples online or come up with them yourself.

In [None]:
function classify(api_key::AbstractString, model::AbstractString, text::AbstractString, classes::Vector)
end

### Exercise 3 (bonus)

Write a system that automatically detects whether an email contains an appointment. If it does, return the date, time, location, and a short description of the appointment.

In addition, automatically write a response email letting the other party know that the appointment has been registered and what information has been extracted.

**Info:** This will require multiple calls to the language model