# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and with Ollama, build a tool that takes a technical question  
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

In [1]:
# imports
import os
import requests
import json 
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from openai import OpenAI
import ollama


In [8]:
# constants

MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3'
OLLAMA_API = "http://localhost:11434/api/chat"


In [3]:
# Prompts

system_prompt = "You are a tutor who answers the user's technical questions in detail. Respond in markdown with key points, \
taking recent developments around the world into account, and keep the tone appropriate to the question.\n"

system_prompt += "Some examples are:"
system_prompt += """
{"question": "1+1?", "response": "2"},
{"question": "Why should we learn about LLMs?", "response": "Learning about Large Language Models (LLMs) is important because they are a rapidly evolving technology with the potential to significantly impact various industries, offering advanced capabilities in text generation, translation, information retrieval, and more, which can be valuable for professionals across diverse fields, allowing them to enhance their work and gain a competitive edge by understanding and utilizing these powerful language processing tools. \
Key reasons to learn about LLMs:\
Career advancement:\
Familiarity with LLMs can open up new career opportunities in fields like AI development, natural language processing (NLP), content creation, research, and customer service, where LLM applications are increasingly being implemented. \
Increased productivity:\
LLMs can automate repetitive tasks like writing emails, summarizing documents, generating reports, and translating text, freeing up time for more strategic work. \
Enhanced decision-making:\
By providing insights from large datasets, LLMs can assist in informed decision-making across various industries, including business, healthcare, and finance. \
Creative potential:\
LLMs can be used to generate creative content like poems, stories, scripts, and marketing copy, fostering innovation and new ideas. \
Understanding the technology landscape:\
As LLMs become increasingly prevalent, understanding their capabilities and limitations is crucial for navigating the evolving technological landscape. \
 "},
{"question": "what is the future of AI?", "response": "AI is predicted to grow increasingly pervasive as technology develops, revolutionising sectors including healthcare, banking, and transportation"},
"""


In [4]:
# set up environment
load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
# The model name is already defined as MODEL_GPT in the constants cell
openai = OpenAI()

API key looks good so far
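
The check above prints one generic message when anything is wrong. As a minimal sketch, the same checks can be split so that each failure says what to fix. `describe_api_key` is a hypothetical helper, not part of the course code, and the `sk-proj-` prefix is simply taken from the cell above.

```python
def describe_api_key(key):
    """Return a short diagnostic message for an OpenAI-style API key.

    Hypothetical helper: the 'sk-proj-' prefix mirrors the check in the cell above.
    """
    if not key:
        return "No API key found; is OPENAI_API_KEY set in your .env file?"
    if key != key.strip():
        return "API key has leading or trailing whitespace; remove it from the .env file"
    if not key.startswith("sk-proj-"):
        return "API key does not start with sk-proj-; check that you copied the right key"
    return "API key looks good so far"


# Made-up keys for illustration only
print(describe_api_key(None))
print(describe_api_key("sk-proj-example "))
print(describe_api_key("sk-proj-example"))
```

In the notebook this could be called as `print(describe_api_key(os.getenv('OPENAI_API_KEY')))`.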


In [5]:
# here is the question; type over this to ask something new

user_question = """
How important is it for Data Engineers to learn about LLMs, considering the evolution of AI nowadays?
"""

In [6]:
# Get gpt-4o-mini to answer, with streaming
def ask_tutor(question):
    stream = openai.chat.completions.create(
        model=MODEL_GPT,
        messages=[
            # The system message carries the tutor instructions; the user message carries the question
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question}
          ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)

# call the gpt-4o-mini to answer with streaming
ask_tutor(user_question)

{"question": "How important is it for Data Engineers to learn about LLMs?", "response": "Learning about Large Language Models (LLMs) is becoming increasingly important for Data Engineers due to the rapid advancements and integration of AI technologies across various sectors. Understanding LLMs can enhance a Data Engineer’s skill set, providing them with tools to manage, process, and utilize large datasets more effectively. Here are key reasons why Data Engineers should consider learning about LLMs:\n\n### Key Reasons to Learn About LLMs for Data Engineers:\n\n1. **Data Processing and Management**: \n   - Data Engineers are responsible for the design and management of data pipelines, and LLMs can help in processing unstructured data (like text) more efficiently, enhancing data quality and usability.\n\n2. **Integration with ETL Processes**: \n   - LLMs can be integrated into Extract, Transform, Load (ETL) frameworks to automate and enhance the transformation of textual data into structured formats.\n\n3. **Facilitating Advanced Analytics**: \n   - By leveraging LLMs, Data Engineers can facilitate more advanced analytics, enabling businesses to derive insights from complex datasets, ultimately assisting business intelligence efforts.\n\n4. **Collaboration with AI Scientists**: \n   - Understanding LLMs can foster collaboration with data scientists and machine learning practitioners, leading to more robust, data-driven applications and solutions.\n\n5. **Optimizing Data Workflows**: \n   - LLMs can automate repetitive tasks associated with data preparation, allowing Data Engineers to focus on more complex problems and improve workflow efficiency.\n\n6. **Adapting to Industry Trends**: \n   - As industries increasingly rely on language models for applications like chatbots, sentiment analysis, and customer service automation, staying updated with LLMs helps Data Engineers remain competitive in their field.\n\n7. 
**Career Opportunities**: \n   - Proficiency in LLMs opens up new career paths including roles in artificial intelligence, machine learning engineering, and enhanced Data Engineering positions focusing on AI-driven projects."}
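
The reply above came back as a JSON object, mirroring the JSON-style examples embedded in the prompt. One alternative, sketched here under assumptions, is to pass each example as its own user/assistant turn so the model sees example conversations rather than JSON text. `FEW_SHOT_EXAMPLES` and `build_messages` are hypothetical names introduced for illustration; the two examples are copied from the prompt cell.

```python
# Hypothetical names for illustration; not part of the original notebook.
FEW_SHOT_EXAMPLES = [
    {"question": "1+1?", "response": "2"},
    {
        "question": "what is the future of AI?",
        "response": "AI is predicted to grow increasingly pervasive as technology develops, "
                    "revolutionising sectors including healthcare, banking, and transportation",
    },
]


def build_messages(system_prompt, examples, question):
    """Build a chat message list: system prompt, example turns, then the real question."""
    messages = [{"role": "system", "content": system_prompt}]
    for example in examples:
        # Each example becomes a user turn followed by the assistant's answer
        messages.append({"role": "user", "content": example["question"]})
        messages.append({"role": "assistant", "content": example["response"]})
    messages.append({"role": "user", "content": question})
    return messages


messages = build_messages("You are a helpful tutor.", FEW_SHOT_EXAMPLES, "What is a tokenizer?")
for message in messages:
    print(message["role"], "->", message["content"][:40])
```

The resulting list could be passed as `messages=` to `openai.chat.completions.create` in place of the two-message list used in `ask_tutor`.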

In [9]:
# Get Llama 3 to answer (MODEL_LLAMA is 'llama3'); needs a local Ollama server with the model pulled
messages = [
    {"role": "user", "content": user_question}
]

response = ollama.chat(model=MODEL_LLAMA, messages=messages)
reply = response['message']['content']
display(Markdown(reply))

# # Alternative: stream the reply, printing each piece as it arrives
# for chunk in ollama.chat(model=MODEL_LLAMA, messages=messages, stream=True):
#     print(chunk['message']['content'], end='', flush=True)


What a great question!

As AI continues to evolve and transform industries, Large Language Models (LLMs) have become increasingly crucial in the field of Natural Language Processing (NLP). For data engineers, learning about LLMs can be highly beneficial, considering the growing importance of NLP and AI in various applications. Here's why:

1. **Growing demand for NLP expertise**: As AI-powered chatbots, voice assistants, and language translation tools become more prevalent, there is a rising need for professionals who can design, develop, and integrate these systems. Data engineers with knowledge of LLMs will be well-positioned to meet this demand.
2. **Improved data processing and analysis**: LLMs are trained on vast amounts of text data, which can help data engineers better understand and process unstructured or semi-structured data. This expertise will enable them to analyze and extract insights from large datasets more effectively.
3. **Enhanced ability to handle diverse data types**: As AI integrates with various data sources (e.g., text, images, audio), LLMs can help data engineers develop a deeper understanding of these different data types. This knowledge will allow them to design more sophisticated data pipelines and integrate multiple data sources seamlessly.
4. **Increased efficiency in data processing**: By leveraging LLMs' capabilities for tasks like text classification, sentiment analysis, and language translation, data engineers can automate many manual processes, freeing up time for higher-value work.
5. **Competitive edge in the job market**: As AI adoption continues to grow, companies are looking for professionals who can integrate AI with traditional data engineering skills. By learning about LLMs, data engineers will be able to differentiate themselves from their peers and remain competitive in the job market.
6. **Potential applications in diverse domains**: LLMs have been applied in various areas, such as healthcare (e.g., medical records analysis), finance (e.g., sentiment analysis of stock market trends), and customer service (e.g., chatbots for customer support). Data engineers with knowledge of LLMs can explore these diverse application domains.
7. **Improved collaboration with AI/ML teams**: As AI and ML become more integrated into data engineering workflows, understanding LLMs will enable data engineers to better collaborate with AI/ML experts, leading to more effective project outcomes.

While it's not necessary for all data engineers to learn about LLMs, having a basic understanding of these models can be beneficial for those interested in NLP and AI applications. It's essential to note that LLMs are just one aspect of AI, and data engineers should still maintain a broad foundation in traditional data engineering skills, such as SQL, programming languages (e.g., Python), and cloud platforms.

In conclusion, learning about LLMs can be highly valuable for data engineers looking to stay ahead of the curve in the rapidly evolving AI landscape. By acquiring knowledge of these models, data engineers can enhance their skills, increase their job prospects, and contribute to innovative applications across various domains.
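
The Llama cell waits for the complete reply before displaying it, while the GPT cell updates its Markdown output as chunks arrive. A rough sketch of the same streaming display for Ollama follows. It assumes the `ollama` Python package's `stream=True` option, which yields chunks whose text sits under `chunk["message"]["content"]`. `cumulative_text` and `stream_llama` are hypothetical helper names.

```python
def cumulative_text(chunks):
    """Yield the reply-so-far after each streamed chunk."""
    reply = ""
    for chunk in chunks:
        reply += chunk["message"]["content"] or ""
        yield reply


def stream_llama(question, model="llama3"):
    """Stream a reply from a local Ollama model into a live Markdown display."""
    # Imported lazily so cumulative_text can be tried without Ollama or IPython installed
    import ollama
    from IPython.display import Markdown, display, update_display

    stream = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": question}],
        stream=True,
    )
    handle = display(Markdown(""), display_id=True)
    reply = ""
    for reply in cumulative_text(stream):
        update_display(Markdown(reply), display_id=handle.display_id)
    return reply


# Stand-in chunks shaped like the streamed responses (assumed shape), so this runs offline
fake_chunks = [{"message": {"content": "Hel"}}, {"message": {"content": "lo"}}]
print(list(cumulative_text(fake_chunks)))
```

In the notebook, `stream_llama(user_question, MODEL_LLAMA)` would play the same role as `ask_tutor(user_question)` does for gpt-4o-mini.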