<a href="https://colab.research.google.com/github/andygma567/LLM-experiments/blob/main/Test_mlflow_%2B_Palm2_API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This is a test of integrating mlflow with my Langchain chains. Later I'd like to study this [web scraping with an LLM example](https://python.langchain.com/docs/use_cases/web_scraping/) in more detail.

## Setup



In [1]:
%%bash
pip install -U -q google-generativeai # PALM API library
pip install -U -q langchain
pip install -q unstructured # for reading urls with langchain
pip install -q transformers # needed by the summary chain

# mlflow things
pip install -q mlflow
pip install -q "pydantic==1.*" # quoted so the shell does not glob-expand the *; test if this works with pydantic 2 later
pip install -q pyngrok


# Write some environment files

These files are for running the project outside of Colab. For personal projects this is probably overkill.

In [2]:
# Write a requirements.txt file
# I don't use pip freeze > requirements.txt because
# colab installs a ton of extra libraries that I don't actually need
text = """
pandas>=1.5
mlflow
transformers
langchain
unstructured
pydantic==1.*
pyngrok
google-generativeai
"""
with open("requirements.txt", "w") as f:
    f.write(text)

It would be interesting to see whether I can install everything from this requirements.txt file alone.
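
One way to try this is to call pip through the interpreter that runs the notebook, so the packages land in the same environment. This is only a sketch: the helper name `install_requirements` and its `dry_run` flag are made up for this example, and it assumes `requirements.txt` is in the current working directory.

```python
import subprocess
import sys


def install_requirements(path="requirements.txt", dry_run=True):
    """Build (and optionally run) a pip install command for a requirements file.

    `install_requirements` and `dry_run` are hypothetical names for this sketch.
    """
    # sys.executable guarantees pip installs into the running interpreter
    cmd = [sys.executable, "-m", "pip", "install", "-q", "-r", path]
    if not dry_run:
        # check=True raises CalledProcessError if pip exits with an error
        subprocess.run(cmd, check=True)
    return cmd


# With dry_run=True nothing is installed; the command is only printed
print(" ".join(install_requirements()))
```

Calling `install_requirements(dry_run=False)` performs the actual install.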

In [3]:
# Write a conda environment yaml - I used ChatGPT
# I might not actually need this but I'll include it just to be safe
# By default, conda is not installed in the colab notebook because colab runs
# docker images
text = """
name: myenv
channels:
  - defaults
dependencies:
  - python>=3.10
  - pip
  - pip:
    - -r requirements.txt
"""
with open("conda.yaml", "w") as f:
    f.write(text)

In [4]:
# Write an MLproject file
# it doesn't have much use because I don't have a main python script but it
# could be useful in the future...
text= '''
name: mlflow + langchain experiment

conda_env: conda.yaml

entry_points:
  main:
    command: python3 -c "print('hello')"
'''
with open("MLproject", "w") as f:
    f.write(text)

# Set up the langchain PALM integration

To get started, you'll need to [create an API key](https://developers.generativeai.google/tutorials/setup). I'm using the [langchain integration](https://api.python.langchain.com/en/latest/chat_models/langchain.chat_models.google_palm.ChatGooglePalm.html#langchain.chat_models.google_palm.ChatGooglePalm).

In [5]:
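import os
from getpass import getpass
from langchain.llms.google_palm import GooglePalm
from langchain.chains.summarize import load_summarize_chain

# Never hardcode an API key in a notebook; read it from the environment
# or prompt for it at runtime instead
if 'GOOGLE_API_KEY' not in os.environ:
    os.environ['GOOGLE_API_KEY'] = getpass('Google API key: ')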

llm = GooglePalm(temperature=0,
                 max_output_tokens=1024,
                 )
chain = load_summarize_chain(llm=llm, chain_type="stuff")

# Try Summarization

[Lang chain summarization example](https://python.langchain.com/docs/use_cases/summarization)

[Reference for PALM2 models](https://developers.generativeai.google/models/language#:~:text=Note%3A%20For%20the%20PaLM%202,about%2060%2D80%20English%20words).

## Load and split data

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# PaLM 2 accepts roughly 8k input tokens,
# but the PaLM API only accepts about 20k bytes per request.
# Rules of thumb: 1 byte ~ 1 char, 4 chars ~ 1 token
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=0)
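
The rules of thumb in the comments above can be turned into a quick budget check. This is a rough sketch: the constants are the approximate figures quoted in this notebook, not official limits, and the helper names `estimate_tokens` and `fits_budget` are made up for this example.

```python
# Approximate figures taken from the comments in the cell above
MAX_INPUT_TOKENS = 8_000     # rough PaLM 2 input limit
MAX_REQUEST_BYTES = 20_000   # rough PaLM API request size limit
CHARS_PER_TOKEN = 4          # heuristic average for English text
CHUNK_SIZE = 2_000           # chunk_size passed to the text splitter


def estimate_tokens(text):
    """Estimate the token count from the character count (heuristic only)."""
    return len(text) // CHARS_PER_TOKEN


def fits_budget(text):
    """Check a chunk against both the token limit and the byte limit."""
    # Non-ASCII characters take more than one byte in UTF-8
    n_bytes = len(text.encode("utf-8"))
    return estimate_tokens(text) <= MAX_INPUT_TOKENS and n_bytes <= MAX_REQUEST_BYTES


chunk = "a" * CHUNK_SIZE
print(estimate_tokens(chunk))      # 500
print(fits_budget(chunk))          # True
print(fits_budget("é" * 12_000))   # False: 12,000 chars encode to 24,000 bytes
```

Under these assumptions the byte limit is the tighter one: 20k bytes is about 5k tokens of plain English, so a 2,000-character chunk leaves a wide margin on both counts.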

In [7]:
import textwrap
from langchain.document_loaders import UnstructuredURLLoader, WebBaseLoader

urls = [
    # I think this one only works with WebBaseLoader
    # "https://sites.google.com/view/mnovackmath/home",
    "https://sites.google.com/view/mnovack",
]
# loader = UnstructuredURLLoader(urls=urls)
loader = WebBaseLoader(web_path=urls)

docs = loader.load_and_split(text_splitter=text_splitter)
# When printing with textwrap.fill(), replace_whitespace=True reads better
# for UnstructuredURLLoader and replace_whitespace=False for WebBaseLoader
print(f"Total number of documents: {len(docs)}\n")
print(f"Num chars per doc: {len(docs[0].page_content)}\n")
print(textwrap.fill(docs[0].page_content, max_lines=10))

Total number of documents: 2

Num chars per doc: 1992

Michael NovackSearch this siteSkip to main contentSkip to
navigationMichael NovackMichael  NovackPostdoctoral Research Associate
at Carnegie Mellon UniversityEmail address: mnovack at andrew dot cmu
dot eduPersonal InfoI am a postdoc at Carnegie Mellon University,
where my mentors are Irene Fonseca and Giovanni Leoni . I am
interested in the calculus of variations, geometric measure theory,
and partial differential equations.Previously, I was a postdoc at the
University of Texas at Austin with Francesco Maggi and the University
of Connecticut with Xiaodong Yan . I completed my doctoral studies at
Indiana University under the supervision of Peter Sternberg  and [...]
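
The trailing `[...]` in the output above comes from the `max_lines` argument of `textwrap.fill()`. A small standalone sketch of how `max_lines` and `replace_whitespace` behave; the sample string is made up to imitate scraped page text.

```python
import textwrap

# Made-up sample that imitates scraped text with stray newlines and spaces
text = "Skip to main content\n\nMichael   Novack\nPostdoctoral Research Associate " * 5

# Default (replace_whitespace=True): each newline or tab becomes a space
# before wrapping, and max_lines truncates the result with " [...]"
collapsed = textwrap.fill(text, width=40, max_lines=3)

# replace_whitespace=False keeps the original newlines inside the wrapped text
preserved = textwrap.fill(text, width=40, max_lines=3, replace_whitespace=False)

print(collapsed)
print("---")
print(preserved)
```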


## Run a summarization chain + mlflow

This is a nice reference: [LLMOps: Experiment Tracking with MLflow for Large Language Models](https://dagshub.com/blog/mlflow-support-for-large-language-models/)

- I need to figure out how to use `mlflow.evaluate()` later. For now I have enough to work with, and `mlflow.evaluate()` is an experimental feature anyway
- Maybe later I can try running the [mlflow example from the docs](https://mlflow.org/docs/latest/models.html#evaluating-with-llms)

This is some `mlflow.evaluate()` code that didn't work for me earlier:
```
# Try to log a table using mlflow.evaluate()
# Use model_type="text" because "summarization" generates extra metrics

# Build a DataFrame from the list of input strings.
# I had to check the model signature to see that the input column name
# defaults to "input_documents"

# For some reason mlflow.evaluate() doesn't work for me;
# I can double check this another time

# df = pd.DataFrame(data=inputs, columns=["input_documents"])
# print(df)

# mlflow.evaluate(
#     model=logged_model.model_uri,
#     model_type="text",
#     data=df,
#     )
```

In [8]:
%%time
# my manual test
import langchain
import textwrap
import mlflow
from pprint import pp
import pandas as pd

mlflow.set_tracking_uri('')
experiment = mlflow.set_experiment('Langchain + mlflow')

# Only the first 2k characters of Matt's webpage can be passed to the API,
# otherwise it raises an error. I never found out why; I assume it is a
# limitation of the PaLM API

urls = [
    "https://sites.google.com/view/mnovackmath/home",
    "https://sites.google.com/view/mnovack",
    "https://math.gmu.edu/~scarney6/index.html", # Sean's website
    ]

for website in urls:
    print()
    print(website)
    loader = WebBaseLoader(web_path=website)
    docs = loader.load_and_split(text_splitter=text_splitter)

    with mlflow.start_run():
        # log the number of docs
        params = {'num_docs': len(docs),
                  'website': website,
                  }
        mlflow.log_params(params)

        # log the prediction
        inputs = [docs[0].page_content]
        outputs = [chain.run(docs[:1])]
        prompts = [chain.llm_chain.prompt.template]

        # log_predictions() has no useful return value, so nothing is assigned
        mlflow.llm.log_predictions(inputs, outputs, prompts)

        # see docs:
        # https://mlflow.org/docs/latest/python_api/mlflow.langchain.html#mlflow.langchain.log_model
        # by default this flavor can infer the signature from the chain
        # which appears to be good enough for my uses
        # but we can also explicitly pass an input example
        # it infers a signature from the input example

        # log the model, I can use the infer signature later if I want
        logged_model = mlflow.langchain.log_model(chain,
                                                  "langchain_summary_chain",
                                                  )

        # I think the artifact view for comparing runs currently only works well for
        #  table artifacts, so I need to use the mlflow.log_table() function
        data_dict = {
            'prompts': prompts,
            'inputs': inputs,
            'outputs': outputs,
        }

        df = pd.DataFrame(data_dict)
        mlflow.log_table(data=df, artifact_file="prediction_results.json")

2023/09/15 23:54:22 INFO mlflow.tracking.fluent: Experiment with name 'Langchain + mlflow' does not exist. Creating a new experiment.



https://sites.google.com/view/mnovackmath/home


2023/09/15 23:54:24 INFO mlflow.tracking.llm_utils: Creating a new llm_predictions.csv for run b9fe22f1c93a43b293805dabead35440.



https://sites.google.com/view/mnovack


2023/09/15 23:54:30 INFO mlflow.tracking.llm_utils: Creating a new llm_predictions.csv for run 7dcb5a4c40f34c379b24904bd954072e.



https://math.gmu.edu/~scarney6/index.html


2023/09/15 23:54:39 INFO mlflow.tracking.llm_utils: Creating a new llm_predictions.csv for run 427d9f5d1dea40328ce3cc46df52d55a.


CPU times: user 2.06 s, sys: 414 ms, total: 2.47 s
Wall time: 20.6 s


# Register a model

In [9]:
# try out programmatically registering the last run
run = mlflow.last_active_run()

mv = mlflow.register_model(f"runs:/{run.info.run_id}/langchain_summary_chain", "model_A")
print(f"Name: {mv.name}")
print(f"Version: {mv.version}")

Successfully registered model 'model_A'.
2023/09/15 23:55:44 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation. Model name: model_A, version 1


Name: model_A
Version: 1


Created version '1' of model 'model_A'.


# Fetch the registered model

In [10]:
model_name = "model_A"
model_version = 1

model_uri = f"models:/{model_name}/{model_version}"

model = mlflow.pyfunc.load_model(model_uri=model_uri)
print(model)

# print the dependencies
print()
file_path = mlflow.pyfunc.get_model_dependencies(model_uri=model_uri, format='pip')
with open(file_path, 'r') as file:
    content = file.read()
    print(content)

2023/09/15 23:55:52 INFO mlflow.pyfunc: To install the dependencies that were used to train the model, run the following command: '%pip install -r /content/mlruns/779359387179644347/427d9f5d1dea40328ce3cc46df52d55a/artifacts/langchain_summary_chain/requirements.txt'.


mlflow.pyfunc.loaded_model:
  artifact_path: langchain_summary_chain
  flavor: mlflow.langchain
  run_id: 427d9f5d1dea40328ce3cc46df52d55a


mlflow==2.7.0
langchain==0.0.292


I have a problem with creating the pandas DataFrame myself. This is probably also why I couldn't get `mlflow.evaluate()` to work.

I think the problem is that the summary chain takes a list of docs as input. mlflow also gives me this warning: `WARNING mlflow: MLflow does not guarantee support for Chains outside of the subclasses of LLMChain, found StuffDocumentsChain`

From the mlflow docs for `mlflow.models.infer_signature()`, it appears that general object types are not supported as a datatype, and from reading the examples of subclassing the pyfunc flavor I don't see any easy way to pass lists of objects as input.

For now, I will only use the log predictions method to log the prompts, and I will be unable to load and use summary chains as pyfunc models. I also considered creating a custom flavor for this, but it is too much effort.
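
To make the mismatch concrete: the logged signature is a single string column, while the chain wants a list of document objects. One idea that stops short of a custom flavor is to serialize the documents to a JSON string and rebuild them on the other side. This is only a sketch of the round trip; the `Doc` dataclass is a hypothetical stand-in for langchain's `Document`, and whether mlflow's langchain flavor could be made to consume such a payload is untested here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a langchain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def docs_to_string(docs):
    """Serialize a list of documents into one JSON string."""
    return json.dumps([asdict(d) for d in docs])


def string_to_docs(payload):
    """Rebuild the list of documents from the JSON string."""
    return [Doc(**item) for item in json.loads(payload)]


docs = [Doc("first chunk", {"source": "https://example.com"}), Doc("second chunk")]
payload = docs_to_string(docs)

# The payload is a plain string, so it fits into a string-typed column
print(type(payload).__name__)
print(string_to_docs(payload) == docs)
```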

In [11]:
model_info = mlflow.models.get_model_info(model_uri)
print(model_info.signature)

inputs: 
  ['input_documents': string]
outputs: 
  ['output_text': string]
params: 
  None



# Set up the UI

In [None]:
import os
os.system("mlflow ui &")

In [None]:
from pyngrok import ngrok

# Terminate open tunnels if exist
ngrok.kill()

# Setting the authtoken (optional)
# Get your authtoken from https://dashboard.ngrok.com/auth
# Prompt for it at runtime instead of hardcoding it in the notebook
from getpass import getpass
NGROK_AUTH_TOKEN = getpass("ngrok authtoken: ")
ngrok.set_auth_token(NGROK_AUTH_TOKEN)

# Open an HTTPS tunnel on port 5000 for http://localhost:5000
public_url = ngrok.connect("5000")

# public_url = ngrok.connect(port="5000", proto="http", options={"bind_tls": True})
print("MLflow Tracking UI:", public_url)