# Inspect
The inspect module provides several useful functions to help get information about live objects such as modules, classes, methods, functions, tracebacks, frame objects, and code objects. For example, it can help you examine the contents of a class, retrieve the source code of a method, extract and format the argument list for a function, or get all the information you need to display a detailed traceback.

There are four main kinds of services provided by this module: type checking, getting source code, inspecting classes and functions, and examining the interpreter stack.
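
As a rough illustration of those four services, the sketch below applies them to `textwrap.dedent` from the standard library. The helper names `describe`, `caller_name`, and `demo` are hypothetical and exist only for this example. Source lookup is wrapped in `try`/`except` because it fails for objects whose source file is not available.

```python
import inspect
import textwrap


def describe(obj):
    """Collect a few facts about a live object using the inspect module."""
    info = {
        "is_function": inspect.isfunction(obj),  # type checking
        "is_class": inspect.isclass(obj),        # type checking
        "signature": None,
        "first_source_line": None,
    }
    try:
        # Inspecting callables: recover the argument list
        info["signature"] = str(inspect.signature(obj))
    except (TypeError, ValueError):
        pass
    try:
        # Getting source code: only works when the source file can be found
        info["first_source_line"] = inspect.getsource(obj).splitlines()[0]
    except (OSError, TypeError):
        pass
    return info


def caller_name():
    """Examining the interpreter stack: name of the function that called us."""
    return inspect.stack()[1].function


def demo():
    return caller_name()


facts = describe(textwrap.dedent)
print(facts)
print(demo())
```

Later in this notebook only `inspect.getsource` is used, to hand the text of a function to the model for review.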

In [0]:
import warnings

# Disable a few less-than-useful UserWarnings from setuptools and pydantic
warnings.filterwarnings("ignore", category=UserWarning)

In [0]:
import functools
import inspect
import os
import textwrap

import openai

import mlflow
from mlflow.models.signature import ModelSignature
from mlflow.pyfunc import PythonModel
from mlflow.types.schema import ColSpec, ParamSchema, ParamSpec, Schema

In [0]:
OPENAI_API_KEY = dbutils.secrets.get(scope="databricks-azure", key="OPENAIAPIKEY")

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

In [0]:
assert "OPENAI_API_KEY" in os.environ, "Please set the OPENAI_API_KEY environment variable."
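
The "set only if missing" check used for the API key can also be written with `os.environ.setdefault`. The sketch below uses a hypothetical variable name, `EXAMPLE_API_KEY`, and a hypothetical helper, `ensure_env`, so that it does not touch the real key.

```python
import os


def ensure_env(name, value):
    """Set an environment variable only when it is not already defined."""
    return os.environ.setdefault(name, value)


# Start from a clean state so the demo is repeatable
os.environ.pop("EXAMPLE_API_KEY", None)

ensure_env("EXAMPLE_API_KEY", "first-value")   # variable is missing, so it is set
ensure_env("EXAMPLE_API_KEY", "second-value")  # already set, so it is left alone
print(os.environ["EXAMPLE_API_KEY"])  # first-value
```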

In [0]:
mlflow.set_experiment("/Users/olonok@hotmail.com/Code Helper 2")


2024/06/16 21:04:52 INFO mlflow.tracking.fluent: Experiment with name '/Users/olonok@hotmail.com/Code Helper 2' does not exist. Creating a new experiment.


<Experiment: artifact_location='dbfs:/databricks/mlflow-tracking/2965825698891942', creation_time=1718571893117, experiment_id='2965825698891942', last_update_time=1718571893117, lifecycle_stage='active', name='/Users/olonok@hotmail.com/Code Helper 2', tags={'mlflow.experiment.sourceName': '/Users/olonok@hotmail.com/Code Helper 2',
 'mlflow.experimentType': 'MLFLOW_EXPERIMENT',
 'mlflow.ownerEmail': 'olonok@hotmail.com',
 'mlflow.ownerId': '1491868126462402'}>

In [0]:
instruction = [
    {
        "role": "system",
        "content": (
            "As an AI specializing in code review, your task is to analyze and critique the submitted code. For each code snippet, provide a detailed review that includes: "
            "1. Identification of any errors or bugs. "
            "2. Suggestions for optimizing code efficiency and structure. "
            "3. Recommendations for enhancing code readability and maintainability. "
            "4. Best practice advice relevant to the code's language and functionality. "
            "Your feedback should help the user improve their coding skills and understand best practices in software development."
        ),
    },
    {"role": "user", "content": "Review my code and suggest improvements: {code}"},
]
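
The `{code}` placeholder in the user message is filled in with the submitted source at prediction time. The sketch below shows the idea with plain `str.format`. It is only an illustration, not MLflow's implementation, and `example_messages` and `fill_messages` are hypothetical names.

```python
import copy

example_messages = [
    {"role": "system", "content": "You are a code reviewer."},
    {"role": "user", "content": "Review my code and suggest improvements: {code}"},
]


def fill_messages(messages, **variables):
    """Return a copy of the messages with {placeholders} replaced by variables."""
    filled = copy.deepcopy(messages)  # leave the template itself untouched
    for message in filled:
        message["content"] = message["content"].format(**variables)
    return filled


filled = fill_messages(example_messages, code="x = 1")
print(filled[1]["content"])  # Review my code and suggest improvements: x = 1
```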

In [0]:
# Define the model signature that will be used for both the base model and the eventual custom pyfunc implementation later.
signature = ModelSignature(
    inputs=Schema([ColSpec(type="string", name=None)]),
    outputs=Schema([ColSpec(type="string", name=None)]),
    params=ParamSchema(
        [
            ParamSpec(name="max_tokens", default=500, dtype="long"),
            ParamSpec(name="temperature", default=0, dtype="float"),
        ]
    ),
)

# Log the base OpenAI model with the included instruction set (prompt)
with mlflow.start_run() as run:
    model_info = mlflow.openai.log_model(
        model="gpt-4",
        task=openai.chat.completions,
        artifact_path="base_model",
        messages=instruction,
        signature=signature,
    )

Uploading artifacts:   0%|          | 0/10 [00:00<?, ?it/s]

2024/06/16 21:08:14 INFO mlflow.store.artifact.cloud_artifact_repo: The progress bar can be disabled by setting the environment variable MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR to false


In [0]:
model_info.model_uri

'runs:/c64b496d34ba43d4b98944ba44bf21d8/base_model'

In [0]:
# Custom pyfunc implementation that applies text and code formatting to the output results from the OpenAI model
class CodeHelper(PythonModel):
    def __init__(self):
        self.model = None

    def load_context(self, context):
        self.model = mlflow.pyfunc.load_model(context.artifacts["model_path"])

    @staticmethod
    def _format_response(response):
        formatted_output = ""
        in_code_block = False

        for item in response:
            lines = item.split("\n")
            for line in lines:
                # Check for the start/end of a code block
                if line.strip().startswith("```"):
                    in_code_block = not in_code_block
                    formatted_output += line + "\n"
                    continue

                if in_code_block:
                    # Don't wrap lines inside code blocks
                    formatted_output += line + "\n"
                else:
                    # Wrap lines outside of code blocks
                    wrapped_lines = textwrap.fill(line, width=80)
                    formatted_output += wrapped_lines + "\n"

        return formatted_output

    def predict(self, context, model_input, params=None):
        # Call the loaded OpenAI model instance to get the raw response
        raw_response = self.model.predict(model_input, params=params)

        # Return the formatted response so that it is easier to read
        return self._format_response(raw_response)
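
The wrapping logic in `_format_response` can be tried without MLflow or OpenAI. The sketch below copies the same idea into a standalone function, `format_response`, with an added `width` parameter. The sample response is made up for the demonstration, and the fence marker is built as `FENCE` only to keep the example easy to embed.

```python
import textwrap

FENCE = "`" * 3  # three backticks, the marker that opens or closes a code block


def format_response(response, width=80):
    """Wrap prose lines to `width`, leaving lines inside code blocks untouched."""
    parts = []
    in_code_block = False
    for item in response:
        for line in item.split("\n"):
            if line.strip().startswith(FENCE):
                # A fence line toggles code-block mode and is kept as-is
                in_code_block = not in_code_block
                parts.append(line)
            elif in_code_block:
                parts.append(line)
            else:
                parts.append(textwrap.fill(line, width=width))
    return "".join(part + "\n" for part in parts)


code_line = "x = 1  # " + "long comment " * 10
sample = ["word " * 30 + "\n" + FENCE + "python\n" + code_line + "\n" + FENCE]

formatted = format_response(sample, width=40)
print(formatted)
```

With `width=40` the prose is broken into short lines, while the long comment inside the code block stays on a single line.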

In [0]:
import datetime

In [0]:
model_info.model_uri

'runs:/c64b496d34ba43d4b98944ba44bf21d8/base_model'

In [0]:
# Define the location of the base model that we'll be using within our custom pyfunc implementation
artifacts = {"model_path": model_info.model_uri}
runname = f'code_helper_{datetime.datetime.now().strftime("%Y-%m-%d_%H:%M:%S")}'
with mlflow.start_run(run_name=runname) as run:
    helper_model = mlflow.pyfunc.log_model(
        artifact_path="code_helper",
        python_model=CodeHelper(),
        input_example=["x = 1"],
        signature=signature,
        artifacts=artifacts,
    )

2024/06/16 21:18:08 INFO mlflow.models.utils: Lists of scalar values are not converted to a pandas DataFrame. If you expect to use pandas DataFrames for inference, please construct a DataFrame and pass it to input_example instead.


Downloading artifacts:   0%|          | 0/10 [00:00<?, ?it/s]

2024/06/16 21:18:09 INFO mlflow.store.artifact.artifact_repo: The progress bar can be disabled by setting the environment variable MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR to false


Uploading artifacts:   0%|          | 0/31 [00:00<?, ?it/s]

2024/06/16 21:18:29 INFO mlflow.store.artifact.cloud_artifact_repo: The progress bar can be disabled by setting the environment variable MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR to false


In [0]:
helper_model.model_uri

'runs:/47b84e1d2f074a6b84ff1595368bed3a/code_helper'

In [0]:
loaded_helper = mlflow.pyfunc.load_model(helper_model.model_uri)

Downloading artifacts:   0%|          | 0/31 [00:00<?, ?it/s]

2024/06/16 21:20:13 INFO mlflow.store.artifact.artifact_repo: The progress bar can be disabled by setting the environment variable MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR to false


In [0]:
help(loaded_helper)

Help on PyFuncModel in module mlflow.pyfunc object:

class PyFuncModel(builtins.object)
 |  PyFuncModel(model_meta: mlflow.models.model.Model, model_impl: Any, predict_fn: str = 'predict')
 |  
 |  MLflow 'python function' model.
 |  
 |  Wrapper around model implementation and metadata. This class is not meant to be constructed
 |  directly. Instead, instances of this class are constructed and returned from
 |  :py:func:`load_model() <mlflow.pyfunc.load_model>`.
 |  
 |  ``model_impl`` can be any Python object that implements the `Pyfunc interface
 |  <https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#pyfunc-inference-api>`_, and is
 |  returned by invoking the model's ``loader_module``.
 |  
 |  ``model_meta`` contains model metadata loaded from the MLmodel file.
 |  
 |  Methods defined here:
 |  
 |  __eq__(self, other)
 |      Return self==value.
 |  
 |  __init__(self, model_meta: mlflow.models.model.Model, model_impl: Any, predict_fn: str = 'predict')
 |      Initiali

In [0]:
run.to_dictionary()

{'info': {'artifact_uri': 'dbfs:/databricks/mlflow-tracking/2965825698891942/47b84e1d2f074a6b84ff1595368bed3a/artifacts',
  'end_time': None,
  'experiment_id': '2965825698891942',
  'lifecycle_stage': 'active',
  'run_id': '47b84e1d2f074a6b84ff1595368bed3a',
  'run_name': 'code_helper_2024-06-16_21:18:07',
  'run_uuid': '47b84e1d2f074a6b84ff1595368bed3a',
  'start_time': 1718572687755,
  'status': 'RUNNING',
  'user_id': ''},
 'data': {'metrics': {},
  'params': {},
  'tags': {'mlflow.databricks.cluster.id': '0616-184429-cxigo2mp',
   'mlflow.databricks.notebook.commandID': '1718565613149_7315845277717661672_e9139b2a5e2049ec9f52cdfd6e25b1b9',
   'mlflow.databricks.notebookID': '1200508474543255',
   'mlflow.databricks.notebookPath': '/Users/olonok@hotmail.com/Databricks LLMOps 2',
   'mlflow.databricks.webappURL': 'https://ukwest.azuredatabricks.net',
   'mlflow.databricks.workspaceID': '1286930193882465',
   'mlflow.databricks.workspaceURL': 'adb-1286930193882465.5.azuredatabricks.ne

In [0]:
# Define the name for the model in the Model Registry.
# We filter out some special characters which cannot be used in model names.
user = "olonok@hotmail.com"
model_name = f"code_helper - {user}"
model_name = model_name.replace("/", "_").replace(".", "_").replace(":", "_")
print(model_name)

code_helper - olonok@hotmail_com
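
The chained `replace` calls can be collapsed into a single translation table. This is a minimal sketch. The helper name `sanitize_model_name` and its defaults are assumptions, and the set of forbidden characters simply mirrors the three replaced above.

```python
def sanitize_model_name(name, forbidden="/.:", replacement="_"):
    """Replace every forbidden character in a model name with `replacement`."""
    table = str.maketrans({ch: replacement for ch in forbidden})
    return name.translate(table)


print(sanitize_model_name("code_helper - olonok@hotmail.com"))
# code_helper - olonok@hotmail_com
```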


In [0]:
helper_model.model_uri

'runs:/47b84e1d2f074a6b84ff1595368bed3a/code_helper'

In [0]:
# Register a new model under the given name, or a new model version if the name exists already.
mlflow.register_model(model_uri=helper_model.model_uri, name=model_name)

Registered model 'code_helper - olonok@hotmail_com' already exists. Creating a new version of this model...
2024/06/16 21:21:32 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: code_helper - olonok@hotmail_com, version 4
Created version '4' of model 'code_helper - olonok@hotmail_com'.


<ModelVersion: aliases=[], creation_timestamp=1718572892011, current_stage='None', description='', last_updated_timestamp=1718572892011, name='code_helper - olonok@hotmail_com', run_id='47b84e1d2f074a6b84ff1595368bed3a', run_link='', source='dbfs:/databricks/mlflow-tracking/2965825698891942/47b84e1d2f074a6b84ff1595368bed3a/artifacts/code_helper', status='PENDING_REGISTRATION', status_message='', tags={}, user_id='1491868126462402', version='4'>

# Test the LLM pipeline

In [0]:
from mlflow import MlflowClient

client = MlflowClient()

In [0]:
client.search_registered_models(filter_string=f"name = '{model_name}'")

[<RegisteredModel: aliases={}, creation_timestamp=1718566410571, description='', last_updated_timestamp=1718572892011, latest_versions=[<ModelVersion: aliases=[], creation_timestamp=1718570009657, current_stage='Archived', description='', last_updated_timestamp=1718570503610, name='code_helper - olonok@hotmail_com', run_id='f060faa5186442e8adebac206d8c7391', run_link='', source='dbfs:/databricks/mlflow-tracking/1200508474543296/f060faa5186442e8adebac206d8c7391/artifacts/base_model', status='READY', status_message='', tags={}, user_id='olonok@hotmail.com', version='2'>,
  <ModelVersion: aliases=[], creation_timestamp=1718572892011, current_stage='None', description='', last_updated_timestamp=1718572897211, name='code_helper - olonok@hotmail_com', run_id='47b84e1d2f074a6b84ff1595368bed3a', run_link='', source='dbfs:/databricks/mlflow-tracking/2965825698891942/47b84e1d2f074a6b84ff1595368bed3a/artifacts/code_helper', status='READY', status_message='', tags={}, user_id='olonok@hotmail.com',

In [0]:
model_version = 3
dev_model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version}")
dev_model

Downloading artifacts:   0%|          | 0/31 [00:00<?, ?it/s]

2024/06/16 21:23:51 INFO mlflow.store.artifact.artifact_repo: The progress bar can be disabled by setting the environment variable MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR to false


mlflow.pyfunc.loaded_model:
  artifact_path: code_helper
  flavor: mlflow.pyfunc.model
  run_id: bf679081e9a04987b0252bbe15e58f5b

In [0]:
client.transition_model_version_stage(model_name, model_version, "Archived")


<ModelVersion: aliases=[], creation_timestamp=1718570466836, current_stage='Archived', description='', last_updated_timestamp=1718573039000, name='code_helper - olonok@hotmail_com', run_id='bf679081e9a04987b0252bbe15e58f5b', run_link='', source='dbfs:/databricks/mlflow-tracking/1200508474543296/bf679081e9a04987b0252bbe15e58f5b/artifacts/code_helper', status='READY', status_message='', tags={}, user_id='1491868126462402', version='3'>

In [0]:
model_version = 4
dev_model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version}")
dev_model

Downloading artifacts:   0%|          | 0/31 [00:00<?, ?it/s]

2024/06/16 21:24:19 INFO mlflow.store.artifact.artifact_repo: The progress bar can be disabled by setting the environment variable MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR to false


mlflow.pyfunc.loaded_model:
  artifact_path: code_helper
  flavor: mlflow.pyfunc.model
  run_id: 47b84e1d2f074a6b84ff1595368bed3a

In [0]:
staging_model = dev_model

In [0]:
def review(func, model):
    """
    Review the source code of a function using an MLflow pyfunc model.

    Args:
        func (function): The function whose source code is reviewed.
        model (mlflow.pyfunc.PyFuncModel): The loaded model that produces the review.

    Returns:
        None if the review succeeds (the review text is printed), otherwise
        a string describing the error.
    """
    try:
        # Extract the source code of the function
        source_code = inspect.getsource(func)

        # Ask the model to review the source code and print the result
        prediction = model.predict([source_code])
        print(prediction)
    except Exception as e:
        # Report any failure from source inspection or prediction
        return f"Error during model prediction or source code inspection: {e}"

In [0]:
def process_data(lst):
    s = 0
    q = []
    for i in range(len(lst)):
        a = lst[i]
        for j in range(i + 1, len(lst)):
            b = lst[j]
            if a == b:
                s += 1
            else:
                q.append(b)
    rslt = [x for x in lst if x not in q]
    k = []
    for i in rslt:
        if i not in k:
            k.append(i)
    final_data = sorted(k, reverse=True)
    return final_data, s


review(process_data, staging_model)

Here's a review of your code:

1. Errors or bugs: There are no syntax errors in your code, but the logic seems
to be flawed. The code is supposed to count the number of duplicate elements in
the list and return a list of unique elements sorted in descending order.
However, the way you're checking for duplicates and creating the list of unique
elements is incorrect and inefficient.

2. Optimizing code efficiency and structure: The current code has a time
complexity of O(n^2) due to the nested for loops and the use of the 'in'
operator in a list, which is inefficient for large lists. You can use Python's
built-in data structures like set and Counter from collections module to
optimize this.

3. Enhancing code readability and maintainability: The variable names are not
descriptive, which makes the code hard to understand. Using meaningful variable
names can greatly improve code readability.

4. Best practice advice: It's a good practice to add docstrings to your
functions to explain what 
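
The `review` pattern can be exercised without a registered model by swapping in a stand-in object that exposes `predict`. In the sketch below, `StubModel` and `review_source` are hypothetical names. The stub only counts lines, and `textwrap.dedent` is used as the target because its source file ships with Python.

```python
import inspect
import textwrap


class StubModel:
    """Hypothetical stand-in for a loaded pyfunc model; it only counts lines."""

    def predict(self, model_input, params=None):
        return [
            f"Received {len(source.splitlines())} lines of source."
            for source in model_input
        ]


def review_source(func, model):
    """Fetch the source of `func` and pass it to `model.predict` as one input."""
    source_code = inspect.getsource(func)
    return model.predict([source_code])[0]


result = review_source(textwrap.dedent, StubModel())
print(result)
```

Note that `inspect.getsource` raises `TypeError` for built-in functions implemented in C and `OSError` when the source file cannot be found, which is why `review` wraps the call in `try`/`except`.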

In [0]:
client.transition_model_version_stage(model_name, model_version, "Production")


<ModelVersion: aliases=[], creation_timestamp=1718572892011, current_stage='Production', description='', last_updated_timestamp=1718573378756, name='code_helper - olonok@hotmail_com', run_id='47b84e1d2f074a6b84ff1595368bed3a', run_link='', source='dbfs:/databricks/mlflow-tracking/2965825698891942/47b84e1d2f074a6b84ff1595368bed3a/artifacts/code_helper', status='READY', status_message='', tags={}, user_id='1491868126462402', version='4'>