## Prompting for Personalized Book Recommendations with an End-to-End Workflow

**Objective:**

This tutorial combines prompting techniques with an end-to-end workflow for personalized book recommendations. You will learn how to design an Azure ML pipeline in which an LLM recommends books based on a user's preferences.

#### End-to-End Workflow Design

The workflow for personalized book recommendations includes:

- Input Data Preparation: Collect user preferences and book metadata.
- Prompt Construction: Design prompts to elicit recommendations.
- LLM Inference: Use an LLM to generate recommendations.
- Post-Processing: Refine results using additional filtering or embeddings.
- Evaluation: Assess the relevance and diversity of recommendations.
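
The prompt construction step can be sketched as a small function that turns user preferences into a structured prompt. This is only an illustrative sketch, not the tutorial's own implementation: the function name `build_prompt`, the fields `favorite_genres` and `recently_read`, and the requested output format are all assumptions.

```python
def build_prompt(favorite_genres, recently_read, num_books=3):
    """Build a recommendation prompt from (hypothetical) user-profile fields."""
    genres = ", ".join(favorite_genres)
    # Fall back to a neutral phrase when the reading history is empty
    read = "; ".join(recently_read) if recently_read else "none listed"
    return (
        "You are a book recommendation assistant.\n"
        f"Favorite genres: {genres}\n"
        f"Recently read: {read}\n"
        f"Recommend {num_books} books the user has not read. "
        "Return one book per line in the form 'Title | Author | one-sentence reason'."
    )


# Example values are made up for illustration
prompt = build_prompt(
    favorite_genres=["science fiction", "space exploration"],
    recently_read=["Dune"],
)
print(prompt)
```

Asking for one book per line in a fixed format tends to make the response easier to parse in a later post-processing step.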

#### Set Up the Azure ML Environment

In [None]:
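# Install the Azure ML SDK (v1) if it is not already installed
%pip install azureml-core azureml-pipeline
import os
os.makedirs("scripts", exist_ok=True)  # folder used by the %%writefile cells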

#### Define Data Preparation Step

Load the user profile and book metadata, then write the processed data for the next step.

In [None]:
%%writefile scripts/prep_step.py
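import argparse
import os

import pandas as pd

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--user_profile", type=str, help="Path to user profile TSV")
    parser.add_argument("--book_metadata", type=str, help="Path to book metadata CSV")
    parser.add_argument("--output", type=str, help="Directory to save processed data")
    args = parser.parse_args()

    # Load the tab-separated user profile and the comma-separated book metadata
    user_profile = pd.read_csv(args.user_profile, sep="\t")
    book_metadata = pd.read_csv(args.book_metadata)

    # Note: the user profile is loaded but not yet used; only the book metadata
    # is passed on. The inference step expects it to have a 'description' column.

    # Create the output directory if needed, then save the preprocessed data
    os.makedirs(args.output, exist_ok=True)
    output_path = os.path.join(args.output, "processed_data.csv")
    book_metadata.to_csv(output_path, index=False)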

if __name__ == "__main__":
    main()
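
The preparation script above loads the user profile without using it. One way the profile could narrow the book metadata is sketched below. It is an assumption-laden illustration: the helper `select_candidates`, the `genre` field, and the sample records are hypothetical, and plain dictionaries stand in for DataFrame rows to keep the example self-contained.

```python
def select_candidates(books, preferred_genres):
    """Keep only books whose genre matches one of the user's preferred genres."""
    # Normalize case and whitespace so "Science Fiction " matches "science fiction"
    wanted = {g.strip().lower() for g in preferred_genres}
    return [b for b in books if b["genre"].strip().lower() in wanted]


# Placeholder records, made up for illustration
books = [
    {"title": "Book A", "genre": "Science Fiction", "description": "A voyage to Mars."},
    {"title": "Book B", "genre": "Romance", "description": "A summer by the sea."},
    {"title": "Book C", "genre": "science fiction ", "description": "First contact."},
]

candidates = select_candidates(books, ["science fiction"])
print([b["title"] for b in candidates])
```

Filtering before inference also reduces the number of LLM calls, since the inference step issues one call per row.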

#### Define LLM Inference Step

Prompt the LLM with each book's description to generate recommendations.


In [None]:
%%writefile scripts/inference_step.py
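import argparse
import os

import pandas as pd
from langchain_openai import ChatOpenAI  # provided by the langchain-openai package

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", type=str, help="Directory containing processed data")
    parser.add_argument("--output", type=str, help="Directory to save recommendations")
    args = parser.parse_args()

    # Load processed data; --input is the directory written by the preparation step
    data = pd.read_csv(os.path.join(args.input, "processed_data.csv"))

    # LLM inference: gpt-4 is a chat model, so use the chat interface.
    # The API key is read from the OPENAI_API_KEY environment variable.
    llm = ChatOpenAI(model="gpt-4", max_tokens=150)
    data["recommendations"] = data["description"].apply(
        lambda x: llm.invoke(f"Recommend books based on: {x}").content
    )

    # Save recommendations
    os.makedirs(args.output, exist_ok=True)
    data.to_csv(os.path.join(args.output, "recommendations.csv"), index=False)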

if __name__ == "__main__":
    main()
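
The workflow lists a post-processing step that the scripts do not implement. A minimal sketch of one possible filter follows, assuming the model returns one title per line. The function name `postprocess`, the `already_read` argument, and the sample response are hypothetical.

```python
import re


def postprocess(raw_response, already_read, max_items=3):
    """Parse one-title-per-line output, dropping duplicates and already-read titles."""
    seen = {t.strip().lower() for t in already_read}
    results = []
    for line in raw_response.splitlines():
        # Remove list markers such as "1.", "2)", "-" or "*" that models often add
        title = re.sub(r"^(?:[-*]|\d+[.)])\s+", "", line.strip())
        key = title.lower()
        if not title or key in seen:
            continue
        seen.add(key)
        results.append(title)
    return results[:max_items]


# Made-up model output for illustration
raw = "1. Book A\n- Book B\n2. book a\n\n3. Book C\n4. Book D"
cleaned = postprocess(raw, already_read=["Book B"])
print(cleaned)
```

Matching on lowercase titles is a deliberately simple choice; near-duplicates such as differing subtitles would need fuzzier matching or embeddings.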


#### Define Evaluation Step

Assess the relevance of recommendations.

In [None]:
%%writefile scripts/evaluation_step.py
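import argparse
import os

import pandas as pd
from sentence_transformers import SentenceTransformer, util

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", type=str, help="Directory containing recommendations")
    parser.add_argument("--output", type=str, help="Directory to save evaluation results")
    args = parser.parse_args()

    # Load recommendations; --input is the directory written by the inference step
    data = pd.read_csv(os.path.join(args.input, "recommendations.csv"))

    # Evaluate using embeddings: cosine similarity between each recommendation
    # and a fixed reference description of the user's interests
    model = SentenceTransformer("all-MiniLM-L6-v2")
    reference = model.encode("science fiction, space exploration")  # encode once
    data["scores"] = data["recommendations"].apply(
        lambda x: util.cos_sim(model.encode(x), reference).item()  # tensor -> float
    )

    # Save evaluation results
    os.makedirs(args.output, exist_ok=True)
    data.to_csv(os.path.join(args.output, "evaluation_results.csv"), index=False)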

if __name__ == "__main__":
    main()
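
The evaluation script scores relevance only, while the workflow also calls for assessing diversity. One common way to quantify diversity is intra-list diversity, defined as one minus the mean pairwise cosine similarity of the recommended items. The sketch below is an illustration rather than part of the pipeline: the function names are hypothetical, and short hand-written vectors stand in for the sentence embeddings.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def intra_list_diversity(embeddings):
    """Return 1 minus the mean pairwise cosine similarity (0.0 for fewer than 2 items)."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        return 0.0
    mean_similarity = sum(cosine(u, v) for u, v in pairs) / len(pairs)
    return 1.0 - mean_similarity


# Toy vectors: identical items give no diversity, orthogonal items give full diversity
identical = intra_list_diversity([[1.0, 0.0], [1.0, 0.0]])
orthogonal = intra_list_diversity([[1.0, 0.0], [0.0, 1.0]])
print(identical, orthogonal)
```

In the pipeline, the input vectors would come from encoding each recommended title or description with the same embedding model used for the relevance score.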


#### Azure ML Pipeline with Script Files
**Define the Pipeline Steps**

In [None]:
from azureml.core import Workspace, Experiment, Environment
from azureml.core.compute import ComputeTarget
from azureml.core.runconfig import RunConfiguration
from azureml.pipeline.core import Pipeline, PipelineData
from azureml.pipeline.steps import PythonScriptStep

# Set up Azure ML workspace, compute target, and environment
ws = Workspace.from_config()
compute_target = ComputeTarget(workspace=ws, name="your-compute-cluster")
env = Environment.from_conda_specification(name="pipeline-env", file_path="environment.yml")

# PythonScriptStep expects a RunConfiguration, so wrap the environment in one
run_config = RunConfiguration()
run_config.environment = env

# Define intermediate outputs; each is passed to the scripts as a directory path
datastore = ws.get_default_datastore()
prep_output = PipelineData("prep_output", datastore=datastore)
inference_output = PipelineData("inference_output", datastore=datastore)
evaluation_output = PipelineData("evaluation_output", datastore=datastore)

# Step 1: Data Preparation
prep_step = PythonScriptStep(
    name="Data Preparation",
    script_name="scripts/prep_step.py",
    arguments=["--user_profile", "user_profile.tsv", "--book_metadata", "book_metadata.csv", "--output", prep_output],
    outputs=[prep_output],
    compute_target=compute_target,
    runconfig=run_config,
    allow_reuse=True,
)

# Step 2: LLM Inference
inference_step = PythonScriptStep(
    name="LLM Inference",
    script_name="scripts/inference_step.py",
    arguments=["--input", prep_output, "--output", inference_output],
    inputs=[prep_output],
    outputs=[inference_output],
    compute_target=compute_target,
    runconfig=run_config,
    allow_reuse=True,
)

# Step 3: Evaluation
evaluation_step = PythonScriptStep(
    name="Evaluation",
    script_name="scripts/evaluation_step.py",
    arguments=["--input", inference_output, "--output", evaluation_output],
    inputs=[inference_output],
    outputs=[evaluation_output],
    compute_target=compute_target,
    runconfig=run_config,
    allow_reuse=True,
)


**Combine Steps into a Pipeline**

In [None]:
# Create pipeline
pipeline = Pipeline(workspace=ws, steps=[prep_step, inference_step, evaluation_step])
pipeline.validate()

# Submit pipeline
experiment = Experiment(workspace=ws, name="book-recommendation-pipeline")
pipeline_run = experiment.submit(pipeline)
pipeline_run.wait_for_completion(show_output=True)
