# Nebius Managed Service for MLflow Demo

This demo walks through key use cases of the Nebius Managed Service for MLflow for AI development.

## Getting Started

1. Launch your instance of the Managed Service for MLflow with [the MLflow quickstart](https://docs.nebius.com/mlflow/quickstart).
2. Set up your API key to connect to Nebius AI Studio with [the AI Studio quickstart](https://docs.nebius.com/studio/inference/quickstart).

> **Note:** A newly launched MLflow cluster may take 30-60 minutes to be fully provisioned and ready to use.

In [None]:
!pip install mlflow==2.20.2 python-dotenv openai

### Secrets and Environment Variables

You will need to set the following environment variables where this notebook is running, so that the code in the following cells can connect to both Nebius Managed Service for MLflow and Nebius AI Studio. 

MLflow:<br>
`MLFLOW_TRACKING_SERVER_CERT_PATH`<br>
`MLFLOW_TRACKING_URI`<br>
`MLFLOW_TRACKING_USERNAME`<br>
`MLFLOW_TRACKING_PASSWORD`<br>

AI Studio:<br>
`NEBIUS_API_KEY`
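
Before running the remaining cells, it can help to confirm that all of these variables are actually set. The following is a minimal sketch using only the standard library. The helper name `missing_env_vars` is hypothetical and is not part of this repository's `env_setup` module.

```python
import os

# Variable names taken from the lists above.
REQUIRED_VARS = [
    "MLFLOW_TRACKING_SERVER_CERT_PATH",
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
    "NEBIUS_API_KEY",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Passing a plain dictionary as `environ` makes the check easy to try without touching the real environment.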

### Environment Setup

To set the environment variables, run the following cell. You may choose to set them interactively or by loading from a `.env` file.

In [None]:
%reload_ext autoreload
%autoreload 2

import os
import sys
from pathlib import Path
from dotenv import load_dotenv

# Add the parent directory to Python path
sys.path.append(str(Path.cwd().parent))

from env_setup import setup_env_from_file, setup_env_interactive, verify_env_setup

# Option 1: Interactive setup
# setup_env_interactive()

# Option 2: Load from .env file
setup_env_from_file('../.env')

# Verify the setup
verify_env_setup()

### Check connection to MLflow


In [None]:
import mlflow 

# List experiments in MLflow
mlflow.search_experiments()

### Set up Nebius AI Studio client

In [3]:
import openai

API_KEY = os.environ.get("NEBIUS_API_KEY")

# Instantiate the client instance
nebius_client = openai.OpenAI(api_key=API_KEY,
                              base_url="https://api.studio.nebius.ai/v1/")


# Example 1: MLflow Tracking Quickstart

https://mlflow.org/docs/latest/getting-started/intro-quickstart/

In [None]:
import mlflow
from mlflow.models import infer_signature

import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score


# Load the Iris dataset
X, y = datasets.load_iris(return_X_y=True)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Define the model hyperparameters
params = {
    "solver": "lbfgs",
    "max_iter": 1000,
    "multi_class": "auto",
    "random_state": 8888,
}

# Train the model
lr = LogisticRegression(**params)
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Calculate metrics
accuracy = accuracy_score(y_test, y_pred)

### Log the model and its metadata to MLflow

- Initiate an MLflow run context to start a new run that we will log the model and metadata to.
- Log model parameters and performance metrics.
- Tag the run for easy retrieval.
- Register the model in the MLflow Model Registry while logging (saving) the model.


In [None]:
# Create a new MLflow Experiment
mlflow.set_experiment("MLflow Quickstart")

# Start an MLflow run
with mlflow.start_run():
    # Log the hyperparameters
    mlflow.log_params(params)

    # Log the accuracy metric
    mlflow.log_metric("accuracy", accuracy)

    # Set a tag that we can use to remind ourselves what this run was for
    mlflow.set_tag("Training Info", "Basic LR model for iris data")

    # Infer the model signature
    signature = infer_signature(X_train, lr.predict(X_train))

    # Log the model
    model_info = mlflow.sklearn.log_model(
        sk_model=lr,
        artifact_path="iris_model",
        signature=signature,
        input_example=X_train,
        registered_model_name="tracking-quickstart",
    )

In [None]:
model_info.model_uri

## Load the model as a Python Function (pyfunc) and use it for inference

- Load the model using MLflow's pyfunc flavor.
- Run predictions on new data using the loaded model.

In [None]:
# Load the model back for predictions as a generic Python Function model
loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

loaded_model

In [None]:
# Get predictions 

predictions = loaded_model.predict(X_test)

iris_feature_names = datasets.load_iris().feature_names

result = pd.DataFrame(X_test, columns=iris_feature_names)
result["actual_class"] = y_test
result["predicted_class"] = predictions

result[:4]

# Example 2: MLflow Tracing for LLM Observability

Tracing enhances LLM observability in your Generative AI (GenAI) applications by capturing detailed information about each step of an execution.

https://mlflow.org/docs/latest/tracing/

## Automatic Tracing of LLM calls

In [None]:
!env | grep MLFLOW_TRACKING_SERVER_CERT_PATH



In [None]:
import mlflow
import openai

tracing_experiment = mlflow.set_experiment("MLflow Tracing")

# Enable MLflow automatic tracing for OpenAI with one line of code!
mlflow.openai.autolog()


# Time to call the LLM -- tracing is done automatically
nebius_client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    temperature=0.95,
    messages=[
        {"role": "system", "content": "You are a chatbot."},
        {"role": "user", "content": "What is the weather like today?"},
    ],
)


## Manual Tracing

1. Instrument a function with the `@mlflow.trace` decorator.
2. Instrument any block of code with the `mlflow.start_span` context manager.
3. Group or annotate traces using tags.
4. Disable tracing globally.

https://mlflow.org/docs/latest/tracing/#manual-tracing

In [7]:
import mlflow
from mlflow.entities import SpanType
import openai
import time

In [None]:
# Use MLflow tracing to trace the execution of a function

@mlflow.trace(span_type="func", attributes={"key": "value"})
def add_1(x):
    return x + 1


@mlflow.trace(span_type="func", attributes={"key": "value"})
def minus_1(x):
    return x - 1


@mlflow.trace(name="Trace Test")
def trace_test(x):
    step1 = add_1(x)
    return minus_1(step1)


trace_test(4)

In [None]:
# Integrate tracing into your LLM workflow

mlflow.openai.autolog()

@mlflow.trace(span_type=SpanType.CHAIN)
def run(question):
    messages = build_messages(question)
    # MLflow automatically generates a span for OpenAI invocation
    response = nebius_client.chat.completions.create(
        # model="gpt-4o-mini",
        model="meta-llama/Llama-3.3-70B-Instruct",
        max_tokens=100,
        messages=messages,
    )
    return parse_response(response)


@mlflow.trace
def build_messages(question):
    return [
        {"role": "system", "content": "You are a helpful chatbot."},
        {"role": "user", "content": question},
    ]

@mlflow.trace
def parse_response(response):
    return response.choices[0].message.content


run("What is MLflow?")

In [None]:
# Get the timestamp for one hour ago, in milliseconds since the Unix epoch
one_hour_ago = int(time.time() - 3600) * 1000  # subtract 3600 seconds (1 hour), then convert seconds to milliseconds

one_hour_ago

In [None]:
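# Search and analyze traces recorded more than one hour ago
# (use `timestamp_ms > ...` instead to find traces from the last hour)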

mlflow.search_traces(
    tracing_experiment.experiment_id, 
    filter_string=f"timestamp_ms < {one_hour_ago}",
)[:3]
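
The filter above compares `timestamp_ms` against a Unix timestamp in milliseconds, so the unit conversion and the comparison direction are easy to get wrong. The following sketch makes both explicit using only the standard library. The helper names `ms_since_epoch` and `timestamp_filter` are hypothetical, and the generated string simply mirrors the `timestamp_ms` comparison used above.

```python
import time


def ms_since_epoch(seconds_ago=0.0, now=None):
    """Return the Unix timestamp, in milliseconds, for `seconds_ago` seconds before `now`."""
    current = time.time() if now is None else now
    return int((current - seconds_ago) * 1000)


def timestamp_filter(cutoff_ms, newer=True):
    """Build a filter string selecting traces newer (or older) than the cutoff."""
    operator = ">" if newer else "<"
    return f"timestamp_ms {operator} {cutoff_ms}"


one_hour_ago_ms = ms_since_epoch(seconds_ago=3600)
print(timestamp_filter(one_hour_ago_ms))               # traces from the last hour
print(timestamp_filter(one_hour_ago_ms, newer=False))  # traces older than one hour
```

The resulting string can be passed as `filter_string` to `mlflow.search_traces`, as in the cell above.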

# Example 3: Tracing LangGraph

- Tracing LangGraph with MLflow https://mlflow.org/docs/latest/tracing/integrations/langgraph 
- Example: Code generation with RAG and self-correction with https://langchain-ai.github.io/langgraph/tutorials/code_assistant/langgraph_code_assistant/

For this example, set these additional environment variables:
- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`

In [None]:
!pip install langchain_openai langchain langgraph langchain_core

In [None]:
from typing import Literal

import mlflow

from langchain_core.messages import AIMessage, ToolCall
from langchain_core.outputs import ChatGeneration, ChatResult
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

# Enabling tracing for LangGraph (LangChain)
mlflow.langchain.autolog()

# Optional: Set a tracking URI and an experiment
# mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("LangGraph")


@tool
def get_weather(city: Literal["nyc", "sf"]):
    """Use this to get weather information."""
    if city == "nyc":
        return "It might be cloudy in nyc"
    elif city == "sf":
        return "It's always sunny in sf"


llm = ChatOpenAI(model="gpt-4o-mini")
tools = [get_weather]
graph = create_react_agent(llm, tools)

# Invoke the graph
result = graph.invoke(
    {"messages": [{"role": "user", "content": "what is the weather in tokyo?"}]}
)

# Example 4: Tracing CrewAI Agents

- Example https://mlflow.org/docs/latest/tracing/integrations/crewai 

In [None]:
!pip install crewai crewai_tools

In [None]:
import mlflow

# Turn on auto tracing by calling mlflow.crewai.autolog()
mlflow.crewai.autolog()

mlflow.set_experiment("CrewAI")

In [None]:
from crewai import Agent, Crew, Task
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
from crewai_tools import SerperDevTool, WebsiteSearchTool

from textwrap import dedent

content = "User's name is John. He is 30 years old and lives in San Francisco."
string_source = StringKnowledgeSource(
    content=content, metadata={"preference": "personal"}
)

search_tool = WebsiteSearchTool()

class TripAgents:
    def city_selection_agent(self):
        return Agent(
            role="City Selection Expert",
            goal="Select the best city based on weather, season, and prices",
            backstory="An expert in analyzing travel data to pick ideal destinations",
            tools=[
                search_tool,
            ],
            verbose=True,
        )

    def local_expert(self):
        return Agent(
            role="Local Expert at this city",
            goal="Provide the BEST insights about the selected city",
            backstory="""A knowledgeable local guide with extensive information
        about the city, its attractions and customs""",
            tools=[search_tool],
            verbose=True,
        )

class TripTasks:
    def identify_task(self, agent, origin, cities, interests, range):
        return Task(
            description=dedent(
                f"""
                Analyze and select the best city for the trip based
                on specific criteria such as weather patterns, seasonal
                events, and travel costs. This task involves comparing
                multiple cities, considering factors like current weather
                conditions, upcoming cultural or seasonal events, and
                overall travel expenses.
                Your final answer must be a detailed
                report on the chosen city, and everything you found out
                about it, including the actual flight costs, weather
                forecast and attractions.

                Traveling from: {origin}
                City Options: {cities}
                Trip Date: {range}
                Traveler Interests: {interests}
            """
            ),
            agent=agent,
            expected_output="Detailed report on the chosen city including flight costs, weather forecast, and attractions",
        )

    def gather_task(self, agent, origin, interests, range):
        return Task(
            description=dedent(
                f"""
                As a local expert on this city you must compile an
                in-depth guide for someone traveling there and wanting
                to have THE BEST trip ever!
                Gather information about key attractions, local customs,
                special events, and daily activity recommendations.
                Find the best spots to go to, the kind of place only a
                local would know.
                This guide should provide a thorough overview of what
                the city has to offer, including hidden gems, cultural
                hotspots, must-visit landmarks, weather forecasts, and
                high level costs.
                The final answer must be a comprehensive city guide,
                rich in cultural insights and practical tips,
                tailored to enhance the travel experience.

                Trip Date: {range}
                Traveling from: {origin}
                Traveler Interests: {interests}
            """
            ),
            agent=agent,
            expected_output="Comprehensive city guide including hidden gems, cultural hotspots, and practical travel tips",
        )


class TripCrew:
    def __init__(self, origin, cities, date_range, interests):
        self.cities = cities
        self.origin = origin
        self.interests = interests
        self.date_range = date_range

    def run(self):
        agents = TripAgents()
        tasks = TripTasks()

        city_selector_agent = agents.city_selection_agent()
        local_expert_agent = agents.local_expert()

        identify_task = tasks.identify_task(
            city_selector_agent,
            self.origin,
            self.cities,
            self.interests,
            self.date_range,
        )
        gather_task = tasks.gather_task(
            local_expert_agent, self.origin, self.interests, self.date_range
        )

        crew = Crew(
            agents=[city_selector_agent, local_expert_agent],
            tasks=[identify_task, gather_task],
            verbose=True,
            memory=True,
            knowledge={
                "sources": [string_source],
                "metadata": {"preference": "personal"},
                "collection_name":"knowledge",
            },
        )

        result = crew.kickoff()
        return result


trip_crew = TripCrew("California", "Tokyo", "Dec 12 - Dec 20", "sports")


In [None]:
# Run the crew

result = trip_crew.run()
result.dict()

## Updated example 

In [None]:
!pip install crewai==0.108.0 crewai-tools duckduckgo-search langchain-openai

In [None]:
import os
from textwrap import dedent
from crewai import Agent, Crew, Task, Process
from crewai_tools import WebsiteSearchTool
# You might need specific LangChain components depending on your LLM setup
from langchain_openai import ChatOpenAI # Example for OpenAI or compatible APIs like Ollama

print("Libraries imported.")


In [None]:
# Turn on auto tracing by calling mlflow.crewai.autolog()
mlflow.crewai.autolog()

mlflow.set_experiment("CrewAI Agents")

In [None]:
# --- VERY IMPORTANT: Configure Your LLM ---
# Option A: Set OpenAI API Key as Environment Variable (Recommended for Simplicity)
# Make sure your OPENAI_API_KEY environment variable is set *before* launching Jupyter/VSCode.
# CrewAI will automatically pick it up if no specific 'llm' is passed to Agents/Crew.
# You can uncomment the line below to set it temporarily for this session *only*
# os.environ["OPENAI_API_KEY"] = "sk-YOUR_API_KEY_HERE"

# Option B: Configure for a Local or Specific LLM (Example using Ollama via ChatOpenAI)
# Ensure Ollama is running (e.g., `ollama run llama3`)
# ollama_llm = ChatOpenAI(
#     model="llama3", # Or whichever model you are running in Ollama
#     base_url="http://localhost:11434/v1",
#     api_key="NA" # Standard practice for Ollama via langchain-openai
# )
# print("LLM Configuration (Example - Ollama): Set up.")
# If using Option B, you'll need to pass `llm=ollama_llm` when creating Agents below.

# Check if the key is set (optional check for Option A)
api_key = os.environ.get("OPENAI_API_KEY")
if api_key:
    print("OPENAI_API_KEY found. Agents will use the default OpenAI LLM unless specified otherwise.")
else:
    print("WARNING: OPENAI_API_KEY environment variable not found.")
    print("Ensure an LLM is configured either via environment variables or by passing an 'llm' object to Agents.")

## Set up WebsiteSearchTool

In [None]:
# --- Cell 3: Instantiate Tools ---
search_tool = WebsiteSearchTool()
# If you were using Serper:
# from crewai_tools import SerperDevTool
# os.environ["SERPER_API_KEY"] = "YOUR_SERPER_KEY" # Set environment variable
# search_tool = SerperDevTool()

print("Tools instantiated (WebsiteSearchTool).")

In [None]:
# --- Cell 4: Define Agent Creation Class ---
class MLflowIntegrationAgents:
    def trend_scout_agent(self, llm_config=None): # Pass LLM config if not relying on default
        return Agent(
            role="AI Technology Trend Scout",
            goal=dedent(
                """Identify a specific, promising new tool, library, or technique
                within a given area of AI/ML technology."""
            ),
            backstory=dedent(
                """An expert researcher constantly scanning the horizon
                for emerging AI/ML technologies, publications, and popular
                open-source projects. You prioritize novelty and potential impact."""
            ),
            tools=[search_tool],
            allow_delegation=False,
            verbose=True,
            llm=llm_config # Pass the specific LLM object here if needed
        )

    def technical_analyst_agent(self, llm_config=None): # Pass LLM config if not relying on default
        return Agent(
            role="AI Technical Analyst",
            goal=dedent(
                """Analyze the technical capabilities, architecture, pros, cons,
                and primary use cases of a specific AI tool or technique."""
            ),
            backstory=dedent(
                """A meticulous engineer who dives deep into documentation,
                tutorials, and technical blogs to understand how technologies
                work under the hood. You focus on practical implementation details."""
            ),
            tools=[search_tool],
            allow_delegation=False,
            verbose=True,
            llm=llm_config # Pass the specific LLM object here if needed
        )

    def mlflow_integration_assessor_agent(self, llm_config=None): # Pass LLM config if not relying on default
        return Agent(
            role="MLflow Integration Assessor",
            goal=dedent(
                """Assess how a given AI tool/technique could be integrated
                into a standard MLflow workflow. Identify potential logging points,
                artifact types, customizability needs, or existing MLflow plugins."""
            ),
            backstory=dedent(
                """A seasoned MLOps engineer with deep expertise in MLflow.
                You understand the ML lifecycle and how various components
                can be tracked and managed using MLflow runs, artifacts, models,
                and parameters. You think practically about integration points."""
            ),
            tools=[search_tool], # May need search for finding MLflow plugins
            allow_delegation=False,
            verbose=True,
            llm=llm_config # Pass the specific LLM object here if needed
        )

print("MLflowIntegrationAgents class defined.")
# Example instantiation check (if using Option B from Cell 2):
# agent_creator = MLflowIntegrationAgents()
# scout = agent_creator.trend_scout_agent(llm_config=ollama_llm)
# print(scout) # Check agent creation

In [None]:
# --- Cell 5: Define Task Creation Class ---
class MLflowIntegrationTasks:
    # Task 1: No context needed as it's the first step
    def scout_task(self, agent, area_of_interest):
        return Task(
            description=dedent(
                f"""
                Identify one specific, noteworthy, and relatively new open-source tool,
                library, or technique related to '{area_of_interest}'.
                Provide its name and a brief (1-2 sentence) description of what it does.
                Focus on things gaining traction or offering novel capabilities.
                Avoid general concepts; find a concrete example.

                Example Areas: Vector Databases, LLM Serving Frameworks,
                               Data Versioning for ML, Explainable AI Libraries.

                Your final output MUST be the name of the tool/technique and the brief description ONLY.
            """
            ),
            agent=agent,
            expected_output="The name of a specific tool/library/technique and a 1-2 sentence description.",
            # cache=True # Optional: Cache the output of this task
        )

    # Task 2: Depends on the output of scout_task
    def analyze_task(self, agent, area_of_interest, context_task): # Added context_task parameter
        return Task(
            description=dedent(
                f"""
                Based on the tool/technique identified in the previous step for the area '{area_of_interest}',
                perform a technical analysis. Research its core features, how it works
                (high-level architecture if possible), main benefits, and potential drawbacks or limitations.
                Use web search to find its documentation, tutorials, or technical articles.

                Your final output must be a bulleted list summarizing:
                - Core Features
                - How it Works (Briefly)
                - Key Benefits
                - Potential Drawbacks/Limitations
            """
            ),
            agent=agent,
            expected_output="Bulleted list summarizing features, workings, benefits, and drawbacks.",
            context=[context_task] # Explicitly state dependency on the previous task
            # cache=True # Optional: Cache the output of this task
        )

    # Task 3: Depends on the output of analyze_task
    def assess_mlflow_integration_task(self, agent, context_task): # Added context_task parameter
        return Task(
            description=dedent(
                f"""
                Considering the identified tool/technique and its technical analysis from the previous steps:
                Assess how this tool/technique could be integrated with or tracked by MLflow.
                Think about the typical ML lifecycle (data prep, training, evaluation, deployment, monitoring).
                Specifically suggest:
                1.  What parameters related to this tool could be logged to MLflow?
                2.  What metrics could be tracked?
                3.  What kind of artifacts could be logged (e.g., config files, model files specific to the tool, evaluation plots)?
                4.  Are there any known MLflow plugins or standard integration patterns? (Perform a quick search if unsure).
                5.  What are the key challenges or considerations for integration?

                Your final output must be a report addressing these 5 points clearly.
                Focus solely on the MLflow integration aspect.
            """
            ),
            agent=agent,
            expected_output="A report detailing potential MLflow integration points (params, metrics, artifacts), known integrations, and challenges.",
            context=[context_task] # Explicitly state dependency on the previous task
        )

print("MLflowIntegrationTasks class defined.")

In [None]:
# --- Cell 6: Define the Crew Class (with MLflow additions) ---
import mlflow
import time
import traceback # Needed by run() to format exception tracebacks
import json # Needed for logging dicts
from crewai import __version__ as crewai_version # Get CrewAI version

class MLflowIntegrationCrew:
    def __init__(self, area_of_interest):
        self.area_of_interest = area_of_interest
        # Determine LLM config (as before)
        self.llm_config = None # Default to env vars

        # Store agent/task definitions for logging
        self.agents_config = {}
        self.tasks_config = {}

    def _log_configs(self):
        """Logs agent and task configurations as artifacts."""
        print("Logging configurations to MLflow...")
        # Log agent configs
        for name, agent_obj in self.agents_config.items():
            # Convert Agent object to a dictionary (might need refinement based on Agent structure)
            # Simple example: extracting key attributes
            agent_dict = {
                "role": agent_obj.role,
                "goal": agent_obj.goal,
                "backstory": agent_obj.backstory,
                "tools": [tool.name for tool in agent_obj.tools] if agent_obj.tools else [],
                "llm": str(agent_obj.llm) if agent_obj.llm else "Default",
                "verbose": agent_obj.verbose,
                "allow_delegation": agent_obj.allow_delegation
            }
            mlflow.log_dict(agent_dict, f"agent_configs/{name}_config.json")

        # Log task configs
        for name, task_obj in self.tasks_config.items():
            task_dict = {
                "description": task_obj.description,
                "expected_output": task_obj.expected_output,
                "agent": task_obj.agent.role if task_obj.agent else "N/A", # Log agent role
                # Context might be complex to serialize, log its task name/description if possible
                "context_task_descriptions": [ctx.description for ctx in task_obj.context] if task_obj.context else [],
            }
            mlflow.log_dict(task_dict, f"task_configs/{name}_config.json")

    def run(self):
        start_time = time.time()
        final_result = None
        status = "FAILURE" # Default status

        # --- MLflow Run Start ---
        # You might want to set experiment name outside the class or pass it in
        # mlflow.set_experiment("CrewAI MLflow Demo")
        with mlflow.start_run(run_name=f"CrewAI_Scout_{self.area_of_interest[:30]}") as run:
            print(f"MLflow Run started: {run.info.run_id}")

            # 1. Log Parameters & Tags
            print("Logging parameters and tags...")
            mlflow.log_param("area_of_interest", self.area_of_interest)
            mlflow.log_param("crew_process", "sequential") # Assuming sequential
            mlflow.log_param("crewai_version", crewai_version)
            mlflow.set_tag("crew_input_area", self.area_of_interest)
            mlflow.set_tag("crewai_version", crewai_version)
            # Add LLM info if available/consistent
            # llm_model_name = self.llm_config.model_name if self.llm_config else "Default"
            # mlflow.log_param("llm_model", llm_model_name)
            # mlflow.set_tag("llm_model", llm_model_name)

            # Instantiate agents & tasks, store their configs
            agent_creator = MLflowIntegrationAgents()
            task_creator = MLflowIntegrationTasks()

            self.agents_config = {
                "trend_scout": agent_creator.trend_scout_agent(self.llm_config),
                "tech_analyst": agent_creator.technical_analyst_agent(self.llm_config),
                "mlflow_assessor": agent_creator.mlflow_integration_assessor_agent(self.llm_config),
            }
            # Assign agents to local variables for clarity if needed
            trend_scout = self.agents_config["trend_scout"]
            tech_analyst = self.agents_config["tech_analyst"]
            mlflow_assessor = self.agents_config["mlflow_assessor"]

            # Define tasks sequentially and store configs
            scout_task = task_creator.scout_task(trend_scout, self.area_of_interest)
            analyze_task = task_creator.analyze_task(tech_analyst, self.area_of_interest, context_task=scout_task)
            assess_task = task_creator.assess_mlflow_integration_task(mlflow_assessor, context_task=analyze_task)

            self.tasks_config = {
                "scout_task": scout_task,
                "analyze_task": analyze_task,
                "assess_task": assess_task
            }

            # Log Agent/Task Counts
            mlflow.log_metric("agent_count", len(self.agents_config))
            mlflow.log_metric("task_count", len(self.tasks_config))
            mlflow.set_tag("crew_agents", "|".join(self.agents_config.keys()))

            # Log full configs as artifacts
            self._log_configs()

            # Form the crew
            crew = Crew(
                agents=list(self.agents_config.values()),
                tasks=list(self.tasks_config.values()),
                process=Process.sequential,
                verbose=True, # Keep verbose True to capture logs if needed
                # llm=self.llm_config # Optional crew-level LLM
            )

            # 2. Execute the Crew & Capture Output/Logs
            print("\nAttempting to kickoff the crew...")
            try:
                # --- Capture verbose output ---
                # This part can be tricky in standard notebooks.
                # Option A: Use a context manager if available (e.g., from libraries like `io`, `contextlib`)
                # Option B: Run as script and redirect stdout/stderr (easier)
                # Option C: Rely on parsing the 'final_result' if it contains verbose output (less common now)

                # Simple approach: run and hope final_result is useful, log separately if needed
                final_result = crew.kickoff()
                status = "SUCCESS"
                print("Crew kickoff successful.")

            except Exception as e:
                status = "FAILURE"
                final_result = f"Error Type: {type(e).__name__}\nError Details: {e}\n\nTraceback:\n{traceback.format_exc()}"
                print(f"\n--- Crew Execution Failed ---")
                print(final_result)
                # Optionally re-raise the exception if you want the notebook cell to fail
                # raise e
            finally:
                # 3. Log Results & Metrics
                end_time = time.time()
                execution_time = end_time - start_time
                print(f"Execution Time: {execution_time:.2f} seconds")
                print("Logging results and metrics...")

                mlflow.log_metric("execution_time_seconds", execution_time)
                mlflow.set_tag("crew_status", status)
                mlflow.log_metric("success", 1 if status == "SUCCESS" else 0)

                # Log final output as text artifact
                if isinstance(final_result, str):
                    mlflow.log_text(final_result, "final_output.txt")
                else:
                    # Try logging as JSON if it's dict-like, else convert to string
                    try:
                        mlflow.log_dict(dict(final_result), "final_output.json") # Requires a result that can be converted to a dict
                    except Exception:
                        mlflow.log_text(str(final_result), "final_output.txt")

                # --- How to log intermediate steps/verbose log? ---
                # If running as a script redirecting output:
                # with open("execution_log.txt", "r") as f:
                #    mlflow.log_artifact("execution_log.txt")

                # Placeholder - Add logic here if you capture verbose logs some other way
                print("Placeholder: Add logic to capture and log verbose execution output if needed.")

                print(f"MLflow Run completed: {run.info.run_id}")

        return final_result, status # Return status along with result
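
The `run` method leaves capturing the crew's verbose output as a placeholder, and its comments mention a `contextlib` context manager as one option. The sketch below illustrates that option under the assumption that the verbose output is written through Python's `sys.stdout`; output produced some other way (for example, by a subprocess writing directly to the terminal) would not be captured. The helper name `run_and_capture` and the stand-in function `noisy_step` are hypothetical.

```python
import contextlib
import io


def run_and_capture(func, *args, **kwargs):
    """Call func and return (result, text written to stdout during the call)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def noisy_step(x):
    # Stand-in for crew.kickoff(): prints progress and returns a value.
    print(f"working on {x}")
    return x * 2


result, captured = run_and_capture(noisy_step, 21)
print(result)
print(repr(captured))

# Inside an active MLflow run, the captured text could then be stored with
# mlflow.log_text(captured, "execution_log.txt").
```

In `run`, the call `crew.kickoff()` would take the place of `noisy_step`, and the captured text would be logged in the `finally` block alongside the other artifacts.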




### Run 

#### Suggested Area Topics:

LLM Focused:

- "Tools for fine-tuning open-source LLMs" (e.g., Axolotl, Unsloth, libraries for PEFT)
- "Frameworks for evaluating LLM outputs" (e.g., Ragas, DeepEval, TruLens)
- "LLM Observability platforms" (e.g., LangSmith, Helicone, Weights & Biases integrations)
- "Prompt engineering & management toolkits" (e.g., Promptfoo, LangChain/LlamaIndex prompt templates, specialized libraries)
- "Quantization libraries for LLMs" (e.g., AutoGPTQ, bitsandbytes wrappers, specific toolkits)

RAG Focused:
- "Scalable Vector Databases for RAG" (e.g., Qdrant, Weaviate, Milvus alternatives or new entrants)
- "Frameworks implementing advanced RAG techniques" (e.g., Self-RAG, Corrective RAG, ReAct pattern libraries)
- "Evaluation frameworks specifically for RAG pipelines" (e.g., Ragas, TruLens focus on RAG metrics)

Agentic AI Focused:
- "Alternative AI Agent frameworks to LangChain/CrewAI" (e.g., AutoGen, BabyAGI variants, new research frameworks)
- "Tools for creating and managing agent actions/tools" (Libraries simplifying tool definition and secure execution)
- "Frameworks for multi-agent system orchestration" (e.g., AutoGen, specialized simulation environments)

General GenAI Dev Tools:
- "Orchestration tools for complex LLM/RAG pipelines" (e.g., Kestra, Dagster integrations for LLMs)
- "Synthetic data generation tools using LLMs" (Libraries focused on generating structured/unstructured data)

Tips for Choosing:
- Specificity: More specific topics often yield better results than very broad ones.
- Recency: Focus on areas where new tools are actively emerging.
- MLflow Relevance: While all can be tracked, areas like "Observability," "Evaluation," "Fine-tuning," and "Orchestration" have very direct conceptual links to MLflow's core purpose.


In [None]:
# --- Run the crew and display results ---
import traceback # Needed for logging exception tracebacks

print("Running the crew with MLflow Tracing...")
area = "Evaluation frameworks specifically for RAG pipelines"
integration_crew = MLflowIntegrationCrew(area_of_interest=area)

# Run and capture status
final_result, run_status = integration_crew.run()

print("\n\n########################")
print(f"## Crew Run Status: {run_status} ##")
print("########################\n")

print("Final Output (also logged to MLflow):")
print(final_result)

# Open the MLflow UI for your tracking server (the MLFLOW_TRACKING_URI set earlier) to see the run details.
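
To compare several of the suggested topics, the run can be wrapped in a small loop. The sketch below assumes only that the runner returns a `(result, status)` pair, as `integration_crew.run()` does in the cell above. `run_topics` and `fake_run` are hypothetical names for this illustration, and `fake_run` stands in for the real crew so the example runs on its own.

```python
def run_topics(topics, run_fn):
    """Call run_fn(topic) for each topic and collect one status per topic."""
    statuses = {}
    for topic in topics:
        try:
            _result, status = run_fn(topic)
        except Exception as exc:
            # Keep going so one failing topic does not abort the whole sweep.
            status = f"ERROR: {exc}"
        statuses[topic] = status
    return statuses


def fake_run(topic):
    # Hypothetical stand-in for a real crew run; succeeds only for RAG topics.
    if "RAG" in topic:
        return f"report on {topic}", "SUCCESS"
    raise RuntimeError("no results")


topics = [
    "Evaluation frameworks specifically for RAG pipelines",
    "Quantization libraries for LLMs",
]
statuses = run_topics(topics, fake_run)
```

In this notebook, the runner could be `lambda area: MLflowIntegrationCrew(area_of_interest=area).run()`, which would produce one MLflow run per topic.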

In [None]:
import os
import json
import time
import logging
import io
import sys
import contextlib

import mlflow
from crewai import Crew, Process

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)


class AIOpsResearchCrew:
    def __init__(self, task, ai_stack):
        """
        Initialize the crew with the task description and existing AI stack.
        
        Args:
            task (str): Description of the task requiring AI tools
            ai_stack (str): Comma-separated list of existing tools/frameworks used
        """
        self.task = task
        self.ai_stack = ai_stack
        self.run_id = None
        self.execution_log = io.StringIO()
        self.crew = None
        
        # Ensure output directory exists
        os.makedirs("output", exist_ok=True)

    def run(self):
        """Execute the research, analysis, and reporting process with MLflow tracking."""
        
        # --- UPDATE: Start MLflow run ---
        with mlflow.start_run(run_name=f"Tool_Research_{int(time.time())}") as mlflow_run:
            self.run_id = mlflow_run.info.run_id
            
            # --- UPDATE: Log parameters ---
            mlflow.log_param("task", self.task)
            mlflow.log_param("ai_stack", self.ai_stack)
            
            # --- UPDATE: Capture console output ---
            with contextlib.redirect_stdout(self.execution_log), contextlib.redirect_stderr(self.execution_log):
                try:
                    # Initialize agents
                    logger.info("Initializing agents...")
                    agents = AIOpsResearchAgents()
                    researcher = agents.researcher_agent()
                    analyst = agents.analyst_agent()
                    
                    # Initialize tasks
                    logger.info("Setting up tasks...")
                    tasks = AIOpsResearchTasks()
                    search_task = tasks.search_tools_task(researcher, self.task, self.ai_stack)
                    
                    # Create the crew
                    logger.info("Creating crew...")
                    crew = Crew(
                        agents=[researcher, analyst],
                        tasks=[search_task],
                        verbose=True,
                        process=Process.sequential,
                        memory=True
                    )
                    self.crew = crew
                    
                    # Add dependent tasks
                    analyze_task = tasks.analyze_tools_task(analyst, self.task, self.ai_stack)
                    crew.tasks.append(analyze_task)
                    
                    report_task = tasks.create_report_task(analyst, self.task, self.ai_stack)
                    crew.tasks.append(report_task)
                    
                    # Start the crew
                    logger.info("Starting crew execution...")
                    result = crew.kickoff()
  
                    # --- UPDATE: Log metrics ---
                    logger.info("Starting log metrics...")
                    if hasattr(crew, "usage_metrics") and crew.usage_metrics:
                        mlflow.log_metrics(json.loads(crew.usage_metrics.json()))

                    
                    # --- UPDATE: Log task artifacts if they exist ---
                    self._log_artifacts()

                    # --- UPDATE: Set success tag and metric ---
                    succeeded = os.path.exists("output/tool_recommendation_report.md")
                    mlflow.set_tag("status", "SUCCESS" if succeeded else "FAILED")
                    mlflow.log_metric("success", 1 if succeeded else 0)
                    
                    return result
                
                # --- UPDATE: Log execution status and logs ---
                except Exception as e:
                    logger.error(f"Error during crew execution: {e}", exc_info=True)
                    mlflow.set_tag("status", "FAILED")
                    mlflow.log_metric("success", 0)
                    raise  # Re-raise with the original traceback intact
                
                finally:
                    # Log execution trace
                    logger.info("Logging execution trace...")
                    execution_log = self.execution_log.getvalue()
                    mlflow.log_text(execution_log, "execution_log.txt")

    def _log_artifacts(self):
        """Log task output artifacts to MLflow."""
        artifact_files = [
            ("output/tool_candidates.json", "Task 1: Tool Discovery"),
        ]
        
        for file_path, description in artifact_files:
            if os.path.exists(file_path):
                logger.info(f"Logging artifact: {file_path} ({description})")
                mlflow.log_artifact(file_path)
                
                # For JSON files, also log as parameters for easier viewing
                if file_path.endswith('.json'):
                    try:
                        with open(file_path, 'r') as f:
                            data = json.load(f)
                            
                        # Log tool names as parameters
                        tool_names = [tool.get('name', f"Tool {i+1}") for i, tool in enumerate(data)]
                        mlflow.log_param("discovered_tools", ", ".join(tool_names))

                    except Exception as e:
                        logger.warning(f"Error logging JSON data from {file_path}: {str(e)}")
            else:
                logger.warning(f"Artifact file not found: {file_path}")
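
One caveat about the captured execution log: a logging handler typically keeps a reference to the stream that existed when the handler was created, so `logger.info(...)` messages may bypass `contextlib.redirect_stdout` and `contextlib.redirect_stderr`. A minimal sketch of one way to include them, assuming a temporary handler is acceptable, is shown below. `capture_logs` and `demo_logger` are hypothetical names for this illustration.

```python
import contextlib
import io
import logging


@contextlib.contextmanager
def capture_logs(target_logger, buffer, level=logging.INFO):
    """Temporarily attach a handler that writes log records into buffer."""
    handler = logging.StreamHandler(buffer)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
    target_logger.addHandler(handler)
    try:
        yield buffer
    finally:
        # Always detach the handler so later records are not captured.
        target_logger.removeHandler(handler)


demo_logger = logging.getLogger("capture_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # Keep the demo output out of the root logger.

log_buffer = io.StringIO()
with capture_logs(demo_logger, log_buffer):
    demo_logger.info("Initializing agents...")
demo_logger.info("This record is emitted after the handler was removed.")
```

In `AIOpsResearchCrew.run`, the same idea could wrap the crew execution with `capture_logs(logger, self.execution_log)`, so the logger's messages end up in `execution_log.txt` alongside the redirected console output.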