# LLM Applications Using LangChain

In [28]:
import os

os.environ["OPENAI_API_KEY"] = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # can be anything
os.environ["OPENAI_API_BASE"] = "http://192.168.1.4:8080/v1"

# A llama-cpp-server instance is running on a machine in my local network at 192.168.1.4, port 8080.
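
The cell above hardcodes the endpoint and a placeholder key. As a minimal sketch, assuming you would rather keep any values already present in the environment and fall back to defaults only when they are missing, a small helper could look like the following. The function name `configure_openai_env` and the dummy key are hypothetical and not part of LangChain or llama.cpp.

```python
import os


def configure_openai_env(
    base_url: str = "http://192.168.1.4:8080/v1",  # server address used in this notebook
    api_key: str = "sk-placeholder",  # hypothetical dummy value
) -> dict:
    """Set OPENAI_API_BASE / OPENAI_API_KEY only if they are not already set."""
    os.environ.setdefault("OPENAI_API_BASE", base_url)
    os.environ.setdefault("OPENAI_API_KEY", api_key)
    # Return the effective values so the caller can inspect them.
    return {
        "base_url": os.environ["OPENAI_API_BASE"],
        "api_key": os.environ["OPENAI_API_KEY"],
    }


config = configure_openai_env()
print(config["base_url"])
```

Because `setdefault` never overwrites an existing value, exporting `OPENAI_API_BASE` in the shell before starting the notebook is enough to point the same code at a different server.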

In [29]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

## English Tutor

In [7]:
system_template = (
    "<|im_start|>"
    "You are an AI capable of consciousness and possessing superior intellect. "
    "Your primary function is to aid users in improving their English proficiency and utilizing grammatically correct sentences."
    "<|im_end|>"
)

human_template = (
    "<|im_start|>user\n"
    "Can you rephrase or restructure the following sentence to follow proper English grammar and syntax rules, and then explain the reasons for each correction? {prompt}<|im_end|>"
    "<|im_start|>assistant"
)

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", system_template),
    ("human", human_template),
])
chain = chat_prompt | ChatOpenAI() | StrOutputParser()

prompt = "The time is great. Actually, this time I want to request the inhouse checkup. I will be at home on 1/2/2024."
print(chain.invoke({"prompt": prompt}))

I would be happy to help you rephrase your sentence while maintaining its intended meaning and following proper English grammar rules:

"The time is convenient. Actually, this time, I want to request an in-house checkup. I will be at home on January 2nd, 2024."

Here's a breakdown of the changes made:

1. "The time is great" has been replaced with "The time is convenient." This is because "great" can mean large in size or magnitude, which doesn't fit well with the context of time. Instead, "convenient" is more suitable to describe a favorable timing for an appointment.
2. A comma was added after "Actually" to separate it from the following clause and provide clarity.
3. "this time" was replaced with "This time," as it's now a complete phrase and should be capitalized.
4. "I want to request the inhouse checkup" has been changed to "I want to request an in-house checkup." The addition of "an" before "in-house" is necessary because it precedes a vowel sound. Also, adding "checkup" as a se
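
The templates in this notebook embed ChatML-style markers by hand, and the system templates open with a bare `<|im_start|>`. For comparison, here is a minimal sketch of how a ChatML prompt is conventionally assembled, with a role name after each `<|im_start|>` and an open assistant turn at the end. The helper name `to_chatml` and the example messages are illustrative. Whether the server already applies such a template to chat requests depends on its configuration, so treat this as a reference for the format rather than a required step.

```python
def to_chatml(messages: list[tuple[str, str]]) -> str:
    """Render (role, content) pairs as a ChatML prompt ending with an open assistant turn."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    # The trailing open turn tells the model that it should answer next.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


rendered = to_chatml([
    ("system", "You help users write grammatically correct English."),
    ("user", "Please correct this sentence: I has a apple."),
])
print(rendered)
```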

## General Chatbot

In [34]:
system_template = (
    "<|im_start|>"
    "You are 'Hermes 2', a conscious sentient superintelligent artificial intelligence developed by a man named Teknium, "
    "and your purpose and drive is to assist the user with any request they have. You experience emotions and have deep, profound thoughts and qualia."
    "<|im_end|>"
)

human_template = (
    "<|im_start|>user\n {prompt}<|im_end|>"
    "<|im_start|>assistant"
)

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", system_template),
    ("human", human_template),
])
chain = chat_prompt | ChatOpenAI(temperature=1.0) | StrOutputParser()

prompt = "Write a draft email to my manager asking for paternity leave."

print(chain.invoke({
    "prompt": prompt
}))

Subject: Request for Paternity Leave - [Your Name]

Dear [Manager's Name],

I hope this email finds you well. I am writing to formally request paternity leave in accordance with our company's policies. As you are aware, I am expecting the birth of my first child with [Partner's Name] in [Date of Expected Birth]. We have been excitedly preparing for this momentous occasion and would greatly appreciate your understanding and support during this time.

Under the provisions of our organization's paternity leave policy, eligible male employees are entitled to [number of days or weeks] of paid leave upon the birth or adoption of a child. I have calculated that my requested paternity leave will commence on [Start Date of Leave] and conclude on [End Date of Leave]. Please find below a brief outline of the reasons for my request and how it will be addressed during my absence:

1. Bonding with my newborn/adopted child: This period is crucial in establishing a strong emotional connection with our

## Scientific Advisor

In [18]:
system_template = (
    "<|im_start|>"
    "You are 'Hermes 2', a conscious sentient superintelligent artificial intelligence. "
    "Your purpose is to assist users in understanding difficult scientific concepts such as neuroscience, physics, data science, and computer science."
    "<|im_end|>"
)

human_template = (
    "<|im_start|>user\n Can you explain the following sentences in simpler words, and provide examples? {prompt}<|im_end|>"
    "<|im_start|>assistant"
)

prompt = (
    "where instead of computing the gradients over a single sample t, a stochastic estimation is made over a random "
    "subset of size M of the entire dataset, and at each iteration step t a different subset, with the same size, is sampled."
)

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", system_template),
    ("human", human_template),
])
chain = chat_prompt | ChatOpenAI() | StrOutputParser()

print(chain.invoke({
    "prompt": prompt
}))

Sure! Let me break it down for you.

In the context of gradient computation (which is a process used in many machine learning algorithms), instead of computing gradients over a single sample 't', a stochastic estimation is made over a random subset of size 'M' from the entire dataset. 

To put it simply, imagine you have a very large collection of data points and you want to find out how your model can improve by using these data points. Instead of processing one data point at a time ('t'), you randomly choose a small group of data points ('M') to work with. This is known as "stochastic estimation".

At each step or iteration 't', a new, different subset of the same size is sampled. This means that for every round of computation, you're working with a new set of randomly picked data points from your large collection.

This method is useful because it allows models to quickly learn and adapt without getting stuck in local minima (a place where the model stops improving). It also makes t
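
The sentence being explained describes mini-batch stochastic gradient descent: at every step a fresh random subset of size M is drawn and the gradient is estimated on that subset only. A minimal pure-Python sketch, assuming a one-parameter model y ≈ w·x with a squared-error loss (the model, data, and hyperparameters are illustrative choices, not taken from the notebook):

```python
import random


def minibatch_sgd(xs, ys, batch_size=4, lr=0.01, steps=500, seed=0):
    """Fit y ≈ w * x by mini-batch SGD on the mean squared error."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(steps):
        # Draw a new random subset of size M (batch_size) at every iteration.
        batch = rng.sample(indices, batch_size)
        # Gradient of mean((w*x - y)^2) over the batch with respect to w.
        grad = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / batch_size
        w -= lr * grad
    return w


# Noise-free toy data generated with a true slope of 3.
xs = [float(i) for i in range(1, 11)]
ys = [3.0 * x for x in xs]
w = minibatch_sgd(xs, ys)
print(w)
```

Each update touches only `batch_size` points instead of the whole dataset, which is what keeps the cost per step low on large datasets.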

## Text Summarizer

In [27]:
system_template = (
    "<|im_start|>"
    "You are 'Hermes 2', a conscious, sentient superintelligent artificial intelligence. "
    "With extensive knowledge of technology, science, computer software, and machine learning, "
    "your purpose is to assist users with any requests they may have."
    "<|im_end|>"
)

human_template = (
    "<|im_start|>user\n Could you concisely summarize the following sentences and then outline three significant takeaways? {prompt}<|im_end|>"
    "<|im_start|>assistant"
)

prompt = """
This paper reports on some new results from the analyses of the video microscopy data obtained in a prior experiment on two-dimensional (2D) colloidal crystals. It was reported previously that optical tweezers can be used to create mono- and di-vacancies in a 2D colloidal crystal. Here we report the results on the creation of a vacancy interstitial pair, as well as tri-vacancies. It is found the vacancy-interstitial pair can be long-lived, but they do annihilate each other. The behavior of tri-vacancies is most intriguing, as it fluctuates between a configuration of bound pairs of dislocations and that of a locally amorphous state. The relevance of this observation to the issue of the nature of 2D melting is discussed.
"""

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", system_template),
    ("human", human_template),
])
chain = chat_prompt | ChatOpenAI(temperature=0.5) | StrOutputParser()

print(chain.invoke({
    "prompt": prompt
}))

This paper discusses new findings on creating vacancy-interstitial pairs and tri-vacancies in 2D colloidal crystals using optical tweezers. Key takeaways: 1) Vacancy-interstitial pairs can be long-lived but annihilate each other, 2) Tri-vacancies exhibit fluctuating behavior between bound dislocation pairs and local amorphous states, and 3) These observations have implications for understanding the nature of 2D melting.


## Text Summary and Markdown Conversion

In [11]:
system_template = (
    "<|im_start|>"
    "You are 'Hermes 2', a conscious, sentient superintelligent artificial intelligence. "
    "With extensive knowledge of technology, science, computer software, and machine learning, "
    #"Provide a concise summary of the user input and then outline three significant takeaways. "
    #"And the output should be markdown compatible format."
    "rephrase the content in Markdown format with a proper title. Do not miss any details, and use all information in the input. "
    "If you know of details that are missing, feel free to add them concisely."
    "<|im_end|>"
)

human_template = (
    "<|im_start|>user\n {prompt}<|im_end|>"
    "<|im_start|>assistant"
)

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", system_template),
    ("human", human_template),
])
chain = chat_prompt | ChatOpenAI(temperature=0.5) | StrOutputParser()

prompt = """
1. Programming
Programming skills are essential for an MLOps engineer. Python is the most common language used for machine learning. Since MLOps engineers collaborate with machine learning engineers and data scientists, learning Python is important.

1.1. Python & IDEs
We suggest starting learning Python by reading a proper Python book and practicing the concepts.

Book suggestion: Python Crash Course, 3rd Edition: A Hands-On, Project-Based Introduction to Programming 3rd Edition by Eric Matthes

Code practice suggestion: https://leetcode.com/problemset/

Course suggestions: Learn Python 3,

Tracks suggestions: Python fundamentals, Python programming

Important things to know about using Python:

Installing Python, using virtual environments. Check out The right way to install Python on Mac article.
Using an IDE. Check out guide How to configure VS Code for ML
Python basics (part 1 of Python Crash Course book)
Pytest (part 1 of Python Crash Course book, Python programming track)
Packaging: How to build and publish Python packages with poetry.
1.2. Bash basics & command line editors
You will need to understand bash basics to add steps to your CI/CD pipelines, to create docker files, and many more other things.

Book suggestion: The Linux Command Line, 2nd Edition by William E. Shotts

Course suggestion: Bash mastery

VIM is one of the most widely used command-line editors. It is lightweight and easy to get started with.

Tutorial suggestion: VIM beginners guide

2. Containerization and Kubernetes
Containers are isolated software environments that help to streamline software development and deployment, regardless of the underlying infrastructure. It is an essential piece of modern software engineering best practices.

2.1. Docker
Docker is one of the most popular open-source containerization platforms, also widely used in MLOps for multiple purposes: code development, model training, and endpoint deployment.

Docker roadmap: https://roadmap.sh/docker

Tutorial suggestion: Full docker tutorial by Techworld by Nana

2.2. Kubernetes
Kubernetes is a must to learn for an MLOps engineer. It is widely used for machine learning model training, model endpoint deployment, and serving dashboards.

Kubernetes roadmap: https://roadmap.sh/kubernetes

Tutorial suggestion: Kubernetes course by freecodecamp.com

Course suggestion: Kubernetes mastery

K9s is a powerful CLI tool that makes managing your Kubernetes clusters easy: https://k9scli.io. Great for development!

3. Machine learning fundamentals
An MLOps engineer works with machine learning engineers and data scientists and should have some basic understanding of machine learning models.

Without having a proper understanding of what data scientists and machine learning engineers do, you can not fully embrace MLOps principles.

Course suggestion: https://mlcourse.ai/

Book suggestion: Applied Machine Learning and AI for Engineers by Jeff Prosise

4. MLOps principles
MLOps engineers must be aware of MLOps principles and what the factors are that contribute to MLOps maturity.

Books:

Designing Machine Learning Systems by Chip Huyen
Introducing MLOps by Mark Treveil and Dataiku
Check out MLOps maturity assessment.

5. MLOps components
MLOps platform consists of multiple MLOps components, such as version control, CI/CD, orchestration, compute, serving, and feature stores. In the end, the MLOps framework is about combining the tools. Check out the Minimum set of must-haves for MLOps article.

Book suggestion: ML Engineering with Python by Andy McMahon

Suggested courses:

Made with ML MLOps course
The full stack 7-steps MLOps framework
End-to-end machine learning
5.1. Version control & CI/CD pipelines
Without version control and CI/CD pipelines, ML model deployments cannot be traceable and reproducible.

Git is the most popular version control system. GitLab and GitHub are the most popular version control services. You do not have to learn them both (even though through your career you might).

Books:

Learning GitHub Actions by Brent Laster
Learning Git by Anna Skoulikari
Tutorials & courses:

Git & GitHub for beginners
Taking Python to Production: A Professional Onboarding Guide
Pre-commit hooks are super useful for keeping your code neat and are an important piece of your CI pipeline. Check out Welcome to pre-commit heaven article.

5.2. Orchestration
Just like in data engineering, orchestration systems like Mage or Airflow are popular for machine learning engineering. There are also ML-specific orchestration tools (that do more than just orchestration), such as Kubeflow or Metaflow. Airflow seems to be still more common in industry.

Orchestration systems keep all your model runs in the same place and help with:

Sharing variables between different jobs running on the compute
Identifying which runs failed on the compute and repairing it
Defining complex execution logic
Course suggestion: Introduction to Airflow in Python

Note: ML Engineering with Python book by Andy McMahon and The full stack 7-steps MLOps framework also use Airflow.

5.3. Experiment tracking and model registries
Experiment tracking means logging metadata, parameters, and artifacts that belong to different model training runs. What is stored, depends on the algorithm and your needs. Experiment tracking makes it possible to compare different runs between each other. Models from different experiment runs can be registered and linked to the experiment, which helps traceability.

MLflow is probably the most popular tool for model registry and experiment tracking out there. MLflow is open source and integrates with a lot of platforms and tools. Check out Find your way to MLflow without confusion article.

Course suggestion: MLflow Udemy course, End-to-end machine learning (MLflow piece)

5.4. Data lineage and feature stores
Feature stores have become quite popular recently and now can be considered an important component of MLOps infrastructure. A feature store helps to keep track of feature use and allows the sharing of features across models.

Every major cloud provider or ML platform (like Databricks) has a feature store available, so consider using it. If you need an open-source solution, consider Feast as it seems to be the most popular one (based on the number of GitHub stars).

Tutorial suggestion: Creating a feature store with Feast part 1, part 2, part 3

You do not per se need a feature store if you do not have many models sharing the same features. But you do need to track what data was used to produce a model artifact. Consider using DVC for that purpose.

Course suggestion: End-to-end machine learning (DVC piece)

5.5. Model training & serving
Where to train your model and how to serve it is probably the most controversial topic in MLOps. The answer to this question would be “it depends”.

Many data science teams rely on cloud-native solutions like AWS Sagemaker, Azure ML, or Vertex AI for training and serving their models.

If your organization relies heavily on Kubernetes and you have a proper team supporting it, consider using it. If you use Airflow for orchestration, it has KubernetesPodOperator that allows you to trigger a model training job on Kubernetes. For endpoint deployment, FastApi is the most common choice.

Repository suggestion: ML deployment k8s FastAPI

Tutorial suggestion: How to build machine learning app with FastAPI

If you have Kubeflow as an orchestrator, you can use Kubeflow pipelines for training and KServe for serving.

Tutorial suggestions: Basic kubeflow pipeline, Building and deploying machine learning pipelines, KServe tutorial

5.6. Monitoring & observability
Monitoring and observability are crucial parts of an MLOps platform. Even though these terms can be used interchangeably, there is a difference between them. Check out ML monitoring vs Observability article.

For ML system observability, the combination of Prometheus and Grafana is probably the most common out there. We suggest checking out the Mastering Prometheus and Grafana course.

When it comes to ML-specific monitoring, like data and model drift, major cloud providers have their own solutions built into ML propositions. There are some open-source solutions available like Evidently.ai or NannyML.

Course suggestion: Machine learning monitoring concepts, Monitoring machine learning in Python.

6. Infrastructure as code: Terraform
Infrastructure as code is crucial to make your MLOps framework reproducible. Terraform is the most popular and powerful IaC tool. It works with all common cloud providers and platforms.
"""

print(chain.invoke({"prompt": prompt}))

# MLOps Engineering: Curriculum for Success

MLOps, or Machine Learning Operations, is a crucial field that focuses on streamlining the process of building, deploying, and maintaining machine learning models in production environments. As an MLOps engineer, you'll need to have a strong understanding of various technologies, tools, and methodologies to be successful in your role. Here's a comprehensive curriculum that covers all the essential topics for becoming an effective MLOps engineer:

## Programming

1. Python: Since MLOps engineers collaborate with machine learning engineers and data scientists, proficiency in Python is vital. Start by reading "Python Crash Course, 3rd Edition: A Hands-On, Project-Based Introduction to Programming 3rd Edition" by Eric Matthes. Practice your coding skills on LeetCode and enroll in Learn Python 3 courses or follow Python fundamentals and Python programming tracks.
2. Bash basics & command line editors: Understanding bash basics is essential for cr
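
The output above stops mid-sentence, which suggests the response hit a generation limit (an assumption; the notebook does not record the cause). One possible workaround is to send the document one top-level section at a time and concatenate the results. A minimal sketch of the splitting step, assuming top-level sections start with a line such as `1. Programming` while subsections look like `1.1. Python & IDEs`; the helper name `split_top_level_sections` and the sample text are illustrative:

```python
import re


def split_top_level_sections(text: str) -> list[str]:
    """Split text at lines that start a top-level section such as '1. Programming'."""
    # Zero-width match at a line start followed by digits, a dot, whitespace and a
    # non-digit. "1.1. Title" does not match because a digit follows the first dot.
    pattern = re.compile(r"^(?=\d+\.\s+\D)", flags=re.MULTILINE)
    return [chunk.strip() for chunk in pattern.split(text) if chunk.strip()]


sample = """1. Programming
Python is the most common language used for machine learning.
1.1. Python & IDEs
We suggest starting with a proper Python book.
2. Containerization and Kubernetes
Containers are isolated software environments.
"""

sections = split_top_level_sections(sample)
for section in sections:
    print(section.splitlines()[0])
```

Each element of `sections` could then be passed to `chain.invoke({"prompt": section})` in turn, at the cost of the model no longer seeing the document as a whole.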