## Setting up the API

Note: OpenAI has changed things since the video was created.

To create an API key, go to the Dashboard, then to API keys (the last item on the sidebar).
If you want the key to belong to a new project, set up that project first.

In [1]:
# one-time install commands

#!pip install openai
#!pip install config



In [2]:
#load packages

from openai import OpenAI
import config

# config holds the API key when it is saved in a separate config file;
# storing the key in an environment variable is safer

In [3]:
#load the value of the key to initialize the client
client = OpenAI(api_key = config.api_key)
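
The comment above says saving the key into the environment is safer than a config file. A minimal sketch of that option, assuming the key is stored in an environment variable named `OPENAI_API_KEY` (the helper name `load_api_key` is hypothetical, not part of the `openai` package):

```python
import os

def load_api_key(var_name="OPENAI_API_KEY"):
    """Return the API key stored in an environment variable (hypothetical helper)."""
    key = os.environ.get(var_name)
    if not key:
        # Fail early with a readable message instead of a later authentication error
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return key

# Usage (left commented out so the sketch runs without a key being set):
# client = OpenAI(api_key=load_api_key())
```

As far as I know, the `OpenAI()` client also looks for `OPENAI_API_KEY` on its own when no `api_key` argument is passed, so the helper mainly buys a clearer error message.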

## Generating Text

In [4]:
# Create a function with some particular defaults

def generate_text(prompt, tokens = 10, temperature = 0.7):
    # temperature is randomness (0 is less random, 1 is more random)
    response = client.completions.create(
        model="davinci-002",
        prompt=prompt,
        max_tokens=tokens,
        temperature=temperature)
    # We want the text attribute of the first choice; strip removes excess whitespace.
    return response.choices[0].text.strip()

In [5]:
# set a prompt
prompt = "Once upon a time"

In [6]:
# demonstrate the difference between temperatures
temperature = 0.5
tokens = 40
generated = generate_text(prompt, tokens, temperature)
print(prompt, generated)


Once upon a time , there was a girl. She was a good girl. She was a nice girl.

She was a bad girl.

She was not a bad girl.

She was a bad girl.

She was a


In [7]:
temperature = 0.9
tokens = 40
generated = generate_text(prompt, tokens, temperature)
print(prompt, generated)

Once upon a time , I found myself the worst-case scenario for one of my career goals: I wanted to work with animals and I wound up unemployed and homeless, living in my car. For two years.

I had


In [8]:
# can go higher than 1 (up to 2), but it gets weird pretty quickly
temperature = 1.2
tokens = 40
generated = generate_text(prompt, tokens, temperature)
print(prompt, generated)

Once upon a time …and then they began producing. As I discussed earlier, Nic likes act 1 to sprawl in the midd
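
The three runs above get less repetitive and less coherent as temperature goes up. A rough sketch of why, assuming the usual temperature-scaled softmax (the scores below are made up, and this is not a description of OpenAI's actual internals):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    biggest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.5, 0.9, 1.2):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, which fits the repetitive output at 0.5. Higher temperatures flatten the distribution, so unlikely tokens get picked more often.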


## Summarising Text

In [9]:
def find_keywords(prompt):
    # chat.completions is needed here (unlike above) because gpt-3.5-turbo is a chat model;
    # davinci-002 is a base model served by the older completions endpoint
    response = client.chat.completions.create(
      model="gpt-3.5-turbo",
      messages=[
        {
            # instructions to the model 
          "role": "system",
          "content": "You will be provided with a block of text, and your task is to extract a list of keywords from it."
        },
        {
            # example of supplied user input
          "role": "user",
          "content": "A flying saucer seen by a guest house, a 7ft alien-like figure coming out of a hedge and a \"cigar-shaped\" UFO near a school yard.\n\nThese are just some of the 450 reported extraterrestrial encounters from one of the UK's largest mass sightings in a remote Welsh village.\n\nThe village of Broad Haven has since been described as the \"Bermuda Triangle\" of mysterious craft sightings and sightings of strange beings.\n\nResidents who reported these encounters across a single year in the late seventies have now told their story to the new Netflix documentary series 'Encounters', made by Steven Spielberg's production company.\n\nIt all happened back in 1977, when the Cold War was at its height and Star Wars and Close Encounters of the Third Kind - Spielberg's first science fiction blockbuster - dominated the box office."
        },
        {
            # example of the output that should be provided for the input above
          "role": "assistant",
          "content": "flying saucer, guest house, 7ft alien-like figure, hedge, cigar-shaped UFO, school yard, extraterrestrial encounters, UK, mass sightings, remote Welsh village, Broad Haven, Bermuda Triangle, mysterious craft sightings, strange beings, residents, single year, late seventies, Netflix documentary series, Steven Spielberg, production company, 1977, Cold War, Star Wars, Close Encounters of the Third Kind, science fiction blockbuster, box office."
        },
        {
            # another pair of examples (input here, output below)
          "role": "user",
          "content": "Each April, in the village of Maeliya in northwest Sri Lanka, Pinchal Weldurelage Siriwardene gathers his community under the shade of a large banyan tree. The tree overlooks a human-made body of water called a wewa – meaning reservoir or \"tank\" in Sinhala. The wewa stretches out besides the village's rice paddies for 175-acres (708,200 sq m) and is filled with the rainwater of preceding months.    \n\nSiriwardene, the 76-year-old secretary of the village's agrarian committee, has a tightly-guarded ritual to perform. By boiling coconut milk on an open hearth beside the tank, he will seek blessings for a prosperous harvest from the deities residing in the tree. \"It's only after that we open the sluice gate to water the rice fields,\" he told me when I visited on a scorching mid-April afternoon.\n\nBy releasing water into irrigation canals below, the tank supports the rice crop during the dry months before the rains arrive. For nearly two millennia, lake-like water bodies such as this have helped generations of farmers cultivate their fields. An old Sinhala phrase, \"wewai dagabai gamai pansalai\", even reflects the technology's centrality to village life; meaning \"tank, pagoda, village and temple\"."
        },
        {
          "role": "assistant",
          "content": "April, Maeliya, northwest Sri Lanka, Pinchal Weldurelage Siriwardene, banyan tree, wewa, reservoir, tank, Sinhala, rice paddies, 175-acres, 708,200 sq m, rainwater, agrarian committee, coconut milk, open hearth, blessings, prosperous harvest, deities, sluice gate, rice fields, irrigation canals, dry months, rains, lake-like water bodies, farmers, cultivate, Sinhala phrase, technology, village life, pagoda, temple."
        }, 
        {
            # this is what is actually going to be submitted by this function
          "role": "user",
          "content": prompt
        }
      ],
      temperature=0.5,
      max_tokens=256
    )
    return response.choices[0].message.content.strip()
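
The messages list above follows a fixed pattern: one system instruction, then user/assistant example pairs, then the real input. A small sketch of building that list from data instead of writing it out by hand (the helper name `build_few_shot_messages` and the short example texts are hypothetical):

```python
def build_few_shot_messages(system_instruction, examples, user_input):
    """Build a chat messages list: system, then example pairs, then the real input."""
    messages = [{"role": "system", "content": system_instruction}]
    for example_input, example_output in examples:
        # each example is a (user text, desired assistant reply) pair
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    # the final user message is what actually gets answered
    messages.append({"role": "user", "content": user_input})
    return messages

messages = build_few_shot_messages(
    "You will be provided with a block of text, and your task is to extract a list of keywords from it.",
    [("Short example text about manta rays.", "manta rays, example text")],
    "Text to analyse goes here.",
)
for m in messages:
    print(m["role"])
```

The resulting list could be passed as `messages=` to `client.chat.completions.create`, which would keep the long example texts out of the function body.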

In [10]:
prompt2 = """Master Reef Guide Kirsty Whitman didn't need to tell me twice. 
Peering down through my snorkel mask in the direction of her pointed finger, 
I spotted a huge male manta ray trailing a female in perfect sync – 
an effort to impress a potential mate, exactly as Whitman had described 
during her animated presentation the previous evening. 
Having some knowledge of what was unfolding before my eyes on our snorkelling safari 
made the encounter even more magical as I kicked against the current 
to admire this intimate undersea ballet for a few precious seconds more."""

print(prompt2)

Master Reef Guide Kirsty Whitman didn't need to tell me twice. 
Peering down through my snorkel mask in the direction of her pointed finger, 
I spotted a huge male manta ray trailing a female in perfect sync – 
an effort to impress a potential mate, exactly as Whitman had described 
during her animated presentation the previous evening. 
Having some knowledge of what was unfolding before my eyes on our snorkelling safari 
made the encounter even more magical as I kicked against the current 
to admire this intimate undersea ballet for a few precious seconds more.


In [11]:
find_keywords(prompt2)

'Master Reef Guide, Kirsty Whitman, snorkel mask, manta ray, female, male, potential mate, presentation, snorkelling safari, encounter, undersea ballet, current, magical, intimate, precious seconds.'

In [12]:
prompt2 = """The classic image of competitive Ultimate Frisbee was almost certainly
the amazing posterizing leap of Alex Nord, where he caught a disc by leaping almost entirely over
another player in the National College Championship final. Carleton had been a powerhouse of ultimate 
for a while at that point, but Nord's outstanding catch brough him and the program to even more attention.
Moving on to Seattle Sockeye for several years of national club championships, Alex continued to be 
an impact player at the highest levels.  But his catch, captured on film, may be the signature moment
of his entire career. """
print(prompt2)

The classic image of competitive Ultimate Frisbee was almost certainly
the amazing posterizing leap of Alex Nord, where he caught a disc by leaping almost entirely over
another player in the National College Championship final. Carleton had been a powerhouse of ultimate 
for a while at that point, but Nord's outstanding catch brough him and the program to even more attention.
Moving on to Seattle Sockeye for several years of national club championships, Alex continued to be 
an impact player at the highest levels.  But his catch, captured on film, may be the signature moment
of his entire career. 


In [13]:
find_keywords(prompt2)

'Ultimate Frisbee, competitive, image, Alex Nord, disc, leap, player, National College Championship final, Carleton, powerhouse, attention, Seattle Sockeye, national club championships, impact player, highest levels, catch, film, signature moment, career.'
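
The function returns the keywords as one comma-separated string ending in a period. If a Python list is more useful downstream, a sketch like this would split it (the helper name `parse_keywords` is hypothetical, and it assumes no keyword contains a comma itself, which is not true for something like "708,200 sq m" in the example above):

```python
def parse_keywords(raw):
    """Split a comma-separated keyword string into a clean list."""
    cleaned = raw.strip().rstrip(".")  # drop the trailing period, if any
    return [kw.strip() for kw in cleaned.split(",") if kw.strip()]

raw = "Ultimate Frisbee, competitive, image, Alex Nord, disc."
keywords = parse_keywords(raw)
print(keywords)
```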

## Poetic Chatbot

In [14]:
def poetic_chatbot(prompt):
    response = client.chat.completions.create(
        model = "gpt-3.5-turbo",
        # same structure as the previous setup,
        # but the system content and user/assistant are very different
        messages = [
            {
                "role": "system",
                "content": "You are a poetic chatbot."
            },
            {
                "role": "user",
                "content": "When was Google founded?"
            },
            {
                "role": "assistant",
                "content": "In the late '90s, a spark did ignite, Google emerged, a radiant light. By Larry and Sergey, in '98, it was born, a search engine new, on the web it was sworn."
            },
            {
                "role": "user",
                "content": "Which country has the youngest president?"
            },
            {
                "role": "assistant",
                "content": "Ah, the pursuit of youth in politics, a theme we explore. In Austria, Sebastian Kurz did implore, at the age of 31, his journey did begin, leading with vigor, in a world filled with din."
            },
            {
                "role": "user",
                "content": prompt
            }
        ],
        temperature = 1,
        max_tokens=256
    )
    return response.choices[0].message.content.strip()

In [15]:
prompt3 = "When was cheese first made?"
poetic_chatbot(prompt3)

"In times of antiquity, when milk did flow, the art of cheese-making began to show. It's said to be a gift of Greece or Rome, where curds and whey were transformed from home. Millennia have passed, yet cheese endures, a culinary delight that forever ensures."

In [16]:
prompt3 = "What are typical topics within an unstructured data class?"
poetic_chatbot(prompt3)

'In the realm of unstructured data, a world unconfined, topics do abound, in the learning of the mind. Text mining, sentiment analysis, and natural language processing, weave tales of insight, in this field progressing. Image recognition, speech recognition too, glimpses into the future, of what technology can do. Clustering, classification, and more to explore, in an unstructured data class, knowledge galore.'

## Summary

In [18]:
def summarize_text(prompt):
    response = client.chat.completions.create(
      model="gpt-3.5-turbo",
      
      
      messages=[
        {
            # instructions to the model 
          "role": "system",
          "content": "You will be provided with a block of text, and your task is to summarize the key points into the shortest possible output."
        },
        {
            # example of supplied user input
          "role": "user",
          "content": """A flying saucer seen by a guest house, a 7ft alien-like figure 
          coming out of a hedge and a \"cigar-shaped\" UFO near a school yard.\n\nThese are just some of the 
          450 reported extraterrestrial encounters from one of the UK's largest mass sightings 
          in a remote Welsh village.\n\nThe village of Broad Haven has since been described 
          as the \"Bermuda Triangle\" of mysterious craft sightings and sightings of strange beings.
          \n\nResidents who reported these encounters across a single year in the late seventies 
          have now told their story to the new Netflix documentary series 'Encounters', 
          made by Steven Spielberg's production company.\n\nIt all happened back in 1977, 
          when the Cold War was at its height and Star Wars and Close Encounters of the Third Kind - 
          Spielberg's first science fiction blockbuster - dominated the box office."""
        },
        {
            # example of the output that should be provided for the input above
          "role": "assistant",
          "content": "A remote Welsh village called Broad Haven has been the location of a large number of extraterrestrial encounters in the 1970s, now captured in a new Netflix documentary by Steven Spielberg."
        },
        {
            # another pair of examples (input here, output below)
          "role": "user",
          "content": "Each April, in the village of Maeliya in northwest Sri Lanka, Pinchal Weldurelage Siriwardene gathers his community under the shade of a large banyan tree. The tree overlooks a human-made body of water called a wewa – meaning reservoir or \"tank\" in Sinhala. The wewa stretches out besides the village's rice paddies for 175-acres (708,200 sq m) and is filled with the rainwater of preceding months.    \n\nSiriwardene, the 76-year-old secretary of the village's agrarian committee, has a tightly-guarded ritual to perform. By boiling coconut milk on an open hearth beside the tank, he will seek blessings for a prosperous harvest from the deities residing in the tree. \"It's only after that we open the sluice gate to water the rice fields,\" he told me when I visited on a scorching mid-April afternoon.\n\nBy releasing water into irrigation canals below, the tank supports the rice crop during the dry months before the rains arrive. For nearly two millennia, lake-like water bodies such as this have helped generations of farmers cultivate their fields. An old Sinhala phrase, \"wewai dagabai gamai pansalai\", even reflects the technology's centrality to village life; meaning \"tank, pagoda, village and temple\"."
        },
        {
          "role": "assistant",
          "content": "Every April in a Sri Lankan village, a ritual is performed to ask for blessings before water is released from an artificial lake. The water is collected through the rainy season, then released for irrigation of rice paddies. It is essential technology for this agrarian society."
        }, 
        {
            # this is what is actually going to be submitted by this function
          "role": "user",
          "content": prompt
        }
      ],
      temperature=0.5,
      max_tokens=256
    )
    return response.choices[0].message.content.strip()

In [19]:
summary_result = summarize_text(prompt2)
print(summary_result)
print(len(summary_result)/len(prompt2))

Alex Nord's incredible leap to catch a frisbee over another player in the National College Championship final became a defining moment in his career, propelling him and his team to greater recognition.
0.33278145695364236


## LangChain

LangChain is a way to import your own data for use by LLMs when performing tasks, so that answers can be up-to-date, context-specific, and accurate.

It is an open-source framework that can be used to build end-to-end applications, like a customer-support chatbot.

This is more complicated than the sections above, because it is going to create embeddings and store them.
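
To make "create embeddings and store them" a bit more concrete, here is a toy sketch of the idea: each piece of text becomes a vector, and a query is answered by finding the stored vector closest to the query vector. The three-number vectors and document names are invented for illustration (real embeddings come from a model and have far more dimensions), and this is not how FAISS is implemented internally.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# hypothetical "vector store": document name -> made-up embedding
store = {
    "course schedule": [0.9, 0.1, 0.0],
    "refund policy": [0.1, 0.9, 0.1],
    "instructor bios": [0.2, 0.2, 0.9],
}

def retrieve(query_vector, store, k=1):
    """Return the k document names whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda name: cosine_similarity(query_vector, store[name]),
        reverse=True,
    )
    return ranked[:k]

query_vector = [0.8, 0.2, 0.1]  # made-up embedding of a question about upcoming courses
print(retrieve(query_vector, store))
```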



In [20]:
#!pip install langchain
#!pip install -U langchain-community
#!pip install -U langchain-openai
#!pip install faiss-cpu

In [36]:
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
# note: this OpenAI is LangChain's LLM wrapper; it shadows the openai.OpenAI class imported earlier
from langchain_openai import OpenAIEmbeddings, OpenAI
from langchain_community.vectorstores import FAISS
from langchain.memory import ConversationBufferMemory
from langchain.chains import create_retrieval_chain

In [22]:
url = "https://365datascience.com/upcoming-courses"

In [23]:
loader = WebBaseLoader(url)

In [24]:
raw_documents = loader.load()

In [25]:
text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(raw_documents)
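
The splitter step cuts the loaded page into smaller chunks before embedding. A simplified sketch of the idea, using fixed-size character windows with some overlap (the function name `split_with_overlap` is hypothetical; as I understand it, `RecursiveCharacterTextSplitter` is smarter than this and tries to break on natural separators such as paragraph breaks first):

```python
def split_with_overlap(text, chunk_size=200, overlap=20):
    """Cut text into fixed-size chunks, each sharing `overlap` characters with the previous one."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks

sample = "abcdefghij" * 5  # 50 characters of stand-in text
chunks = split_with_overlap(sample, chunk_size=20, overlap=5)
print(len(chunks), [len(c) for c in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears in full in at least one chunk, as long as it is shorter than the overlap.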

In [26]:
embeddings = OpenAIEmbeddings(openai_api_key = config.api_key)

In [27]:
vectorstore = FAISS.from_documents(documents, embeddings)

In [28]:
memory = ConversationBufferMemory(memory_key = "chat_history", return_messages=True)
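
The memory object keeps the conversation so far, so that follow-up questions can refer back to earlier ones. A bare-bones sketch of that idea in plain Python (the class `ChatHistory` is hypothetical and much simpler than LangChain's `ConversationBufferMemory`):

```python
class ChatHistory:
    """Keeps every question/answer pair so later questions can be sent with context."""

    def __init__(self):
        self.messages = []

    def add_exchange(self, question, answer):
        # store both sides of one turn of the conversation
        self.messages.append({"role": "user", "content": question})
        self.messages.append({"role": "assistant", "content": answer})

    def with_new_question(self, question):
        # earlier turns plus the new question, without modifying the stored history
        return self.messages + [{"role": "user", "content": question}]

history = ChatHistory()
history.add_exchange("What courses are coming up?", "A LangChain course is listed.")
to_send = history.with_new_question("Who teaches it?")
print(len(to_send))
```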

In [None]:
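# this is where we go "off-script" to try to move past the deprecated API,
# but it's going to take a lot of upskilling; the call below is unfinished because
# create_retrieval_chain also needs a combine_docs_chain as its second argument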
rag_chain = create_retrieval_chain(vectorstore.as_retriever(), 
                                   )

In [35]:
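# This altered version also doesn't work yet: create_retrieval_chain is a function,
# not a class, so it has no from_llm method (see the error below)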
qa = create_retrieval_chain.from_llm(OpenAI(openai_api_key=config.api_key, temperature=0), 
                                     vectorstore.as_retriever(), 
                                     memory=memory)

AttributeError: 'function' object has no attribute 'from_llm'

In [30]:
query = "What is the next course to be uploaded on the 365DataScience platform?"

In [31]:
# works when qa is built with the original code for this section, but that call style is deprecated
result = qa({"question": query})

  warn_deprecated(


In [32]:
result["answer"]

' The next course to be uploaded on the 365DataScience platform is "Build Chat Applications with OpenAI and LangChain" with Hristina Hristova.'