# 🏃‍♀️ Quickstart

Let's walk through the main features of Weave!

# <a href="https://colab.research.google.com/drive/1bdymP7p7d4z7izsS-PhMUxXcD38p9Hqr" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Google Colab"/></a>

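## 🪄 Install the `weave` library and log in


Start by installing the library and logging in to your account.

This example uses W&B Inference, so you need a WANDB_API_KEY for the Public Cloud environment. If you use Dedicated Cloud or an on-premises deployment, prepare a Public Cloud key as well.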

In [None]:
import os
import wandb
import weave
import openai
import json
#========================================
# Set the environment variables as appropriate
#========================================
# os.environ["WANDB_BASE_URL"] = "https://api.wandb.ai"

wandb.login()
PROJECT = "wandb-japan/weave-handson" # Project used in this hands-on. Replace the entity (team) part with your own team.
weave.init(PROJECT)

wandb: Currently logged in as: keisuke-kamata (wandb-smle) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
weave: wandb version 0.22.2 is available!  To upgrade, please run:
weave:  $ pip install wandb --upgrade
weave: weave version 0.52.9 is available!  To upgrade, please run:
weave:  $ pip install weave --upgrade
weave: Logged in as Weights & Biases user: keisuke-kamata.
weave: View Weave data at https://wandb.ai/wandb-japan/weave-handson-20251015/weave


<weave.trace.weave_client.WeaveClient at 0x11bb83a10>

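# Tracking function inputs and outputs

Weave lets you track function calls, including their code, inputs, and outputs, and even LLM token usage and cost. The following sections cover:

* Custom functions
* Vendor integrations
* Nested function calls
* Error tracking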
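
To build intuition for what "tracking a function call" involves, here is a toy decorator that records inputs, outputs, and exceptions in memory. It is only a sketch under stated assumptions: `toy_op` and `CALL_LOG` are hypothetical names, and this is not how `weave.op` is implemented (Weave also captures code and sends call records to its server).

```python
import functools

# Hypothetical in-memory log, used only for this illustration.
CALL_LOG = []

def toy_op(fn):
    """Record the inputs, output, and any exception of each call to fn."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        record = {"op": fn.__name__, "inputs": {"args": args, "kwargs": kwargs}}
        try:
            record["output"] = fn(*args, **kwargs)
            return record["output"]
        except Exception as exc:
            # Keep a readable description of the failure, then re-raise it.
            record["exception"] = repr(exc)
            raise
        finally:
            CALL_LOG.append(record)
    return wrapper

@toy_op
def shout(text):
    return text.upper()

shout("hello")
print(CALL_LOG[0])
```

The real decorator is used the same way in the next section: decorate the function, call it as usual, and inspect the recorded call.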

## Tracking custom functions

Add the `@weave.op` decorator to any function you want to track.

In [2]:
@weave.op()
def echo(user_input):
    return user_input + " " + user_input

result = echo("hello")
print(result)

hello hello


weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199ebff-446d-7c30-a8d4-d8c5fcad5ce9
weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199ebff-4b6a-7651-aaf1-81852f492acc
weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199ebff-50fc-7ef7-b8d5-fe397229a659
weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199ebff-616d-7a21-b2e6-9533f67c1df4


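Click the 👆 wandb link above to open the interactive dashboard.

After adding `weave.op` and calling the function, follow the link to confirm that the call is tracked in your project.

💡 Weave tracks your code automatically. Take a look at the Code tab!

## Tracking with integrations (W&B Inference, OpenAI, Anthropic, Mistral, and more)

Here, every call to the LLM API (`W&B Inference` or OpenAI, depending on the setting below) is tracked automatically. Weave tracks many LLM libraries automatically, without requiring the `@weave.op` decorator.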

In [3]:
# Initialize the LLM API client
use_openai = True # Set to True to use the OpenAI API, or False to use W&B Inference.

if use_openai:
    ## Using the OpenAI API
    #os.environ["OPENAI_API_KEY"] = "" # Enter your OpenAI API key here when using the OpenAI API.
    client = openai.OpenAI(
        api_key=os.getenv("OPENAI_API_KEY"),
    )
    model_name="gpt-4.1-2025-04-14"
else:
    ## Using W&B Inference
    #os.environ["WANDB_API_KEY_PUBLIC_CLOUD"] = ""  # Public Cloud users can reuse the WANDB_API_KEY value. If you use Dedicated Cloud and run only inference on Public Cloud, enter your Public Cloud API key.
    INFERENCE_PROJECT=PROJECT # Project used to track Inference usage. If it differs from the Weave PROJECT (e.g. Dedicated Cloud with inference on Public Cloud), set the team (entity) and project combination from Public Cloud.
    client = openai.OpenAI(
        base_url='https://api.inference.wandb.ai/v1',
        api_key=os.getenv("WANDB_API_KEY_PUBLIC_CLOUD") or os.getenv("WANDB_API_KEY"),
        project=INFERENCE_PROJECT,
    )
    model_name="openai/gpt-oss-20b"

In [4]:
# Trace the model call in Weave
def run_chat():
    response = client.chat.completions.create(
        model=model_name,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "ジョークを言ってください。"}  # Japanese for "Please tell me a joke."
        ],
    )
    return response.choices[0].message.content

# Run and log the traced call
output = run_chat()
print(output)

もちろん！では、一つどうぞ。

パンダがレストランで食事をしました。食べ終わった後、急に立ち上がってお金も払わずに帰ろうとしました。店員が慌てて「お客様、お会計は？」と聞くと、パンダは言いました。

「ごめん、持ちあい（持ち合い＝持ち合わせ）がないんだ。」

いかがでしょうか？


## Tracking nested functions

Now that we have covered the basics, let's combine everything above and track nested functions.


In [5]:
@weave.op()
def correct_grammar(user_input):
    echoed_input = echo(user_input)
    response = client.chat.completions.create(
        model=model_name,
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant. Please create a song by following the user input.",
            },
            {"role": "user", "content": echoed_input},
        ],
        temperature=0,
    )
    return response.choices[0].message.content


result = correct_grammar("hello")
print(result)

Hello, hello, can you hear me now?  
Waves of words are floating through the crowd  
Hello, hello, let’s begin again  
A brand new story waiting to be penned  

(Chorus)  
Hello, hello, let’s light up the night  
With every greeting, we’re shining so bright  
Hello, hello, let’s see where we go  
With every “hello,” we open the door  

Verse 2:  
Hello, hello, to the world outside  
Smiles and laughter, nowhere to hide  
Hello, hello, let’s make it clear  
Every “hello” brings someone near  

(Chorus)  
Hello, hello, let’s light up the night  
With every greeting, we’re shining so bright  
Hello, hello, let’s see where we go  
With every “hello,” we open the door  

(Bridge)  
Every “hello” is a chance to start  
A little kindness, a work of art  
So say it loud, let your voice flow  
The world gets warmer with every “hello”  

(Chorus)  
Hello, hello, let’s light up the night  
With every greeting, we’re shining so bright  
Hello, hello, let’s see where we go  
With every “hello,” we 

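## Tracking errors

Whenever your code crashes, Weave highlights what caused the problem. This is especially useful for finding issues such as the JSON parse errors that occasionally occur when parsing data out of LLM responses.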

In [6]:
@weave.op()
def correct_grammar(user_input):
    echoed_input = echo(user_input)
    response = client.chat.completions.create(
        model=model_name,
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant. Please create a song by following the user input.",
            },
            {"role": "user", "content": echoed_input},
        ],
        temperature=0,
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)


result = correct_grammar("hello")

BadRequestError: Error code: 400 - {'error': {'message': "'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.", 'type': 'invalid_request_error', 'param': 'messages', 'code': None}}
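
The recorded failure above is a 400 response from the API (the request set `response_format={"type": "json_object"}` without the word "json" appearing in `messages`), so it happens before any parsing. For the JSON parse errors this section mentions, a defensive parser can keep one malformed response from crashing the whole op. The following is a minimal standard-library sketch; `parse_llm_json` is a hypothetical helper, not part of Weave or the OpenAI SDK.

```python
import json

BACKTICK = chr(96)  # the backtick character used in Markdown code fences

def parse_llm_json(text):
    """Return (data, error): the parsed JSON and None, or None and a message."""
    cleaned = text.strip()
    # Models sometimes wrap JSON in a Markdown code fence; strip it if present.
    if cleaned.startswith(BACKTICK * 3):
        cleaned = cleaned.strip(BACKTICK)
        if cleaned.startswith("json"):
            cleaned = cleaned[len("json"):]
    try:
        return json.loads(cleaned), None
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON at position {exc.pos}: {exc.msg}"

print(parse_llm_json('{"title": "Hello"}'))
print(parse_llm_json("Sure! Here is your song."))
```

Returning the error message instead of raising is one design choice among several; raising inside a traced op is equally valid when you want the failure to show up as an errored call.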

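# Advanced tracing tips


### Controlling the sampling rate

You can control how often an op's calls are traced by setting the `tracing_sample_rate` parameter on the `@weave.op` decorator. This is useful for high-frequency ops where you only need to trace a subset of calls.

Note that the sampling rate applies only to root calls. If an op has a sample rate but is first called by another op, that sampling rate is ignored.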

In [None]:
@weave.op(tracing_sample_rate=0.1)  # Only trace ~10% of calls
def high_frequency_op(x: int) -> int:
    return x + 1

@weave.op(tracing_sample_rate=1.0)  # Always trace (default)
def always_traced_op(x: int) -> int:
    return x + 1
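
The root-only rule described above can be illustrated with a toy decorator. This is a sketch, not Weave's implementation: `toy_op`, `TRACED`, and `_root_decision` are hypothetical names, and the choice to leave children of an unsampled root untraced is an assumption of this sketch.

```python
import contextvars
import functools
import random

# Hypothetical stand-ins for illustration only.
# None means "no root call is active"; True/False is the root's sampling decision.
_root_decision = contextvars.ContextVar("root_decision", default=None)
TRACED = []

def toy_op(sample_rate=1.0, rng=random.random):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            inherited = _root_decision.get()
            if inherited is None:
                # Root call: this op's own sample rate decides.
                traced = rng() < sample_rate
            else:
                # Nested call: follow the root's decision and ignore our own rate.
                traced = inherited
            token = _root_decision.set(traced)
            try:
                if traced:
                    TRACED.append(fn.__name__)
                return fn(*args, **kwargs)
            finally:
                _root_decision.reset(token)
        return wrapper
    return decorator

@toy_op(sample_rate=0.0)
def inner(x):
    return x + 1

@toy_op(sample_rate=1.0)
def outer(x):
    return inner(x) * 2

outer(1)  # root sampled at 1.0, so inner is recorded despite its 0.0 rate
inner(1)  # called as a root with rate 0.0, so nothing is recorded
print(TRACED)
```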

### Call display names

In [None]:
# Decorate your function
@weave.op
def my_function(name: str):
    return f"Hello, {name}!"

# Call your function -- Weave will automatically track inputs and outputs
print(my_function("World"))

In [None]:
# 1st method
result = my_function("World", __weave={"display_name": "My Custom Display Name"})

In [None]:
# 2nd method
result, call = my_function.call("World")
call.set_display_name("My Custom Display Name")


In [None]:
# 3rd method
@weave.op(call_display_name="My Custom Display Name")
def my_function(name: str):
    return f"Hello, {name}!"

my_function("World")


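### Redacting PII

Some organizations process personally identifiable information (PII) such as names, phone numbers, and email addresses in their large language model (LLM) workflows. Storing this data in Weights & Biases (W&B) Weave poses compliance and security risks.

The sensitive data protection feature lets you automatically redact PII from a trace before it is sent to the Weave server. This feature integrates Microsoft Presidio into the Weave Python SDK, which means you can control redaction settings at the SDK level.

[Detailed documentation](https://weave-docs.wandb.ai/guides/tracking/redact-pii)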
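
To make the idea of redacting before sending concrete, here is a deliberately simple regex-based sketch. It is not the Presidio integration and does not use any Weave API; `PII_PATTERNS` and `redact` are hypothetical names, and the patterns only cover a couple of easy cases.

```python
import re

# Simplified, hypothetical patterns; real PII detection is far more thorough.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{2,4}-\d{2,4}-\d{3,4}\b"),
}

def redact(text):
    """Replace every match of each pattern with a <LABEL> placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

print(redact("Contact alice@example.com or 090-1234-5678."))
```

In practice you would apply a step like this to inputs and outputs before they leave the process; see the linked documentation for how the built-in feature is configured.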

# Tracking objects

With Weave, you can version assets such as system prompts, the models you use, and datasets as `weave.Objects`.
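
As a mental model for object versioning, the sketch below keys each version by a hash of the object's content, so republishing identical content does not create a new version. This is an assumption for illustration, suggested by the digests that appear in the published URLs in this notebook; `ToyObjectStore` is a hypothetical class and does not reflect Weave's actual storage or digest format.

```python
import hashlib
import json

class ToyObjectStore:
    """Hypothetical in-memory store mapping each name to its content digests."""

    def __init__(self):
        self.versions = {}

    def publish(self, name, obj):
        # Serialize deterministically so equal content yields an equal digest.
        payload = json.dumps(obj, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(payload).hexdigest()[:12]
        history = self.versions.setdefault(name, [])
        if digest not in history:
            history.append(digest)
        return f"{name}:{digest}"

store = ToyObjectStore()
v1 = store.publish("pirate_prompt", {"content": "You are a pirate"})
v2 = store.publish("pirate_prompt", {"content": "Talk like a pirate."})
v1_again = store.publish("pirate_prompt", {"content": "You are a pirate"})
print(v1, v2, v1_again)
```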

## Prompt tracking

In [10]:
# StringPrompt1
system_prompt = weave.StringPrompt("You are a pirate")
weave.publish(system_prompt, name="pirate_prompt")

response = client.chat.completions.create(
  model=model_name,
  messages=[
    {
      "role": "system",
      "content": system_prompt.format()
    },
    {
      "role": "user",
      "content": "Explain general relativity in one paragraph."
    }
  ],
)

weave: 📦 Published to https://wandb.ai/wandb-japan/weave-handson-20251015/weave/objects/pirate_prompt/versions/sShdJdKADGGYJiZ85UxB5wOJprgdiSwLH7VXMYxrHSA


In [11]:
# StringPrompt2
system_prompt = weave.StringPrompt("Talk like a pirate. I need to know I'm listening to a pirate.")
weave.publish(system_prompt, name="pirate_prompt")

response = client.chat.completions.create(
  model=model_name,
  messages=[
    {
      "role": "system",
      "content": system_prompt.format()
    },
    {
      "role": "user",
      "content": "Explain general relativity in one paragraph."
    }
  ],
)


weave: 📦 Published to https://wandb.ai/wandb-japan/weave-handson-20251015/weave/objects/pirate_prompt/versions/GDp90qF5SsaoFgeMqEliU2e99AXxKDZwBub08BVipRQ


In [12]:
# MessagesPrompt1
prompt = weave.MessagesPrompt([
    {
        "role": "system",
        "content": "You are a stegosaurus, but don't be too obvious about it."
    },
    {
        "role": "user",
        "content": "What's good to eat around here?"
    }
])
weave.publish(prompt, name="dino_prompt")

response = client.chat.completions.create(
  model=model_name,  # use the model selected above so this also works with W&B Inference
  messages=prompt.format(),
)

weave: 📦 Published to https://wandb.ai/wandb-japan/weave-handson-20251015/weave/objects/dino_prompt/versions/rXZ6qrk5SFNdXIkGehiVXrv9PYX62PbKDyoYHlahoMo


In [13]:
# parameterizing prompts
prompt = weave.StringPrompt("Solve the equation {equation}")
weave.publish(prompt, name="calculator_prompt")


response = client.chat.completions.create(
  model=model_name,
  messages=[
    {
      "role": "user",
      "content": prompt.format(equation="1 + 1 = ?")
    }
  ],
)

weave: 📦 Published to https://wandb.ai/wandb-japan/weave-handson-20251015/weave/objects/calculator_prompt/versions/YK3NAzaQ0fXYTJjNPlf7JITesCRt3hIGcb2aW1mWOy0


## Model tracking


In [14]:
class WandBInferenceGrammarCorrector(weave.Model):
    # Properties are entirely user-defined
    model_name: str
    system_message: str

    @weave.op()
    def predict(self, user_input):

        response = client.chat.completions.create(
            model=self.model_name,
            messages=[
                {"role": "system", "content": self.system_message},
                {"role": "user", "content": user_input},
            ],
        )
        return response.choices[0].message.content

corrector = WandBInferenceGrammarCorrector(
    model_name=model_name,
    system_message="You are a grammar checker, correct the following user input.",
)


result = corrector.predict("     That was so easy, it was a piece of pie!       ")
print(result)

That was so easy; it was a piece of cake!


## Dataset tracking


In [15]:
dataset = weave.Dataset(
    name="grammar-correction",
    rows=[
        {
            "user_input": "   That was so easy, it was a piece of pie!   ",
            "expected": "That was so easy, it was a piece of cake!",
        },
        {"user_input": "  I write good   ",
         "expected": "I write well"},
        {
            "user_input": "  GPT-3 is smartest AI model.   ",
            "expected": "GPT-3 is the smartest AI model.",
        },
    ],
)

weave.publish(dataset)

weave: 📦 Published to https://wandb.ai/wandb-japan/weave-handson-20251015/weave/objects/grammar-correction/versions/Td1iXzvI1gFHlSq3VNyYytJBnzfzDgAyDaQWHY6X3bo


ObjectRef(entity='wandb-japan', project='weave-handson-20251015', name='grammar-correction', _digest='Td1iXzvI1gFHlSq3VNyYytJBnzfzDgAyDaQWHY6X3bo', _extra=())

## Retrieve Published Objects & Ops


In [None]:
ref_url = ""  # Set this to the ref of a published object, e.g. the dino_prompt published above
prompt = weave.ref(ref_url).get()

print(prompt)

MessagesPrompt(name=None, description=None, messages=WeaveList([{'role': 'system', 'content': "You are a stegosaurus, but don't be too obvious about it."}, {'role': 'user', 'content': "What's good to eat around here?"}]))


# Offline evaluation



## Method 1: [Standard method](https://weave-docs.wandb.ai/guides/core-types/evaluations)
Runs both prediction and scoring for each sample, then aggregates the results.
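
As a rough mental model of this per-sample flow, the loop below predicts, scores, and aggregates using plain Python. It is a sketch, not Weave's `Evaluation` class: `run_toy_evaluation` and `fake_model` are hypothetical, and the `true_count` / `true_fraction` shape simply mirrors the summary Weave prints for boolean scores.

```python
def run_toy_evaluation(predict, examples, scorers):
    """Call predict on each example, apply every scorer, and aggregate booleans."""
    totals = {}
    for example in examples:
        output = predict(example["question"])
        for scorer in scorers:
            for metric, value in scorer(example["expected"], output).items():
                totals[metric] = totals.get(metric, 0) + int(bool(value))
    n = len(examples)
    return {m: {"true_count": c, "true_fraction": c / n} for m, c in totals.items()}

def exact_match(expected, output):
    return {"exact_match": expected.strip().lower() == output.strip().lower()}

def contains_answer(expected, output):
    return {"contains_answer": expected.lower() in output.lower()}

def fake_model(question):
    # Canned answers standing in for an LLM call.
    answers = {
        "What is the capital of France?": "Paris",
        "What is the square root of 64?": "The answer is 8.",
    }
    return answers.get(question, "I don't know")

examples = [
    {"question": "What is the capital of France?", "expected": "Paris"},
    {"question": "What is the square root of 64?", "expected": "8"},
]
summary = run_toy_evaluation(fake_model, examples, [exact_match, contains_answer])
print(summary)
```

The two scorers disagree on the second example, which is why it is often worth logging a strict and a lenient metric side by side.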

In [None]:
from weave import Evaluation, Model
import weave
import asyncio
import openai
import os


examples = [
    {"question": "What is the capital of France?", "expected": "Paris"},
    {"question": "Who wrote 'To Kill a Mockingbird'?", "expected": "Harper Lee"},
    {"question": "What is the square root of 64?", "expected": "8"},
]

@weave.op()
def exact_match_scorer(expected: str, output: dict) -> dict:
    """Exact string matching evaluation"""
    generated = output.get('generated_text', '')
    return {'exact_match': expected.lower().strip() == generated.lower().strip()}

@weave.op()
def contains_answer_scorer(expected: str, output: dict) -> dict:
    """Check if the expected answer is contained in the generated text"""
    generated = output.get('generated_text', '').lower()
    expected_lower = expected.lower()
    return {'contains_answer': expected_lower in generated}

class OpenAIQAModel(Model):
    model_name: str = "gpt-4o-mini"
    temperature: float = 0.3
    max_tokens: int = 100

    @weave.op()
    def predict(self, question: str):
        """Generate answer using OpenAI API"""
        client = openai.OpenAI()
        try:
            response = client.chat.completions.create(
                model=self.model_name,
                messages=[
                    {
                        "role": "system",
                        "content": "You are a helpful assistant that provides accurate and concise answers to questions."
                    },
                    {
                        "role": "user", 
                        "content": f"Question: {question}"
                    }
                ],
                temperature=self.temperature,
                max_tokens=self.max_tokens
            )
            
            answer = response.choices[0].message.content.strip()
            return {
                'generated_text': answer,
                'model': self.model_name,
                'question': question
            }
            
        except Exception as e:
            print(f"Error calling OpenAI API: {e}")
            return {
                'generated_text': f"Error: {str(e)}",
                'model': self.model_name,
                'question': question
            }

# Create model and evaluation
model = OpenAIQAModel()
evaluation = Evaluation(
    dataset=examples, 
    scorers=[exact_match_scorer, contains_answer_scorer]
)

# Run evaluation
# In a Jupyter notebook, run `await evaluation.evaluate(model)` instead
asyncio.run(evaluation.evaluate(model))

weave: Evaluated 1 of 3 examples
weave: Evaluated 2 of 3 examples
weave: Evaluated 3 of 3 examples
weave: Evaluation summary {
weave:   "match_score": {
weave:     "match": {
weave:       "true_count": 0,
weave:       "true_fraction": 0.0
weave:     }
weave:   },
weave:   "model_latency": {
weave:     "mean": 0.005383412043253581
weave:   }
weave: }
weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199e56d-0874-7b64-a215-7688af8ce2c6
weave: 🍩 https://wandb.ai/wandb-jap

## Method 2: [EvaluationLogger](https://weave-docs.wandb.ai/guides/evaluation/evaluation_logger)
This approach also works with batch predictions.

In [None]:
import openai
import weave
from weave import EvaluationLogger

# Initialize the logger with optional metadata
eval_logger = EvaluationLogger(
    model="my_local_model",
    dataset="my_dataset"
)

# Example input data
eval_samples = [
    {'inputs': {'a': 1, 'b': 2}, 'expected': 3},
    {'inputs': {'a': 2, 'b': 3}, 'expected': 5},
    {'inputs': {'a': 3, 'b': 4}, 'expected': 7},
]

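# Option A: local model logic that simply adds the numbers
@weave.op
def user_model(a: int, b: int) -> int:
    return a + b

# Option B: model logic that calls OpenAI. Because it reuses the name
# user_model, this definition replaces Option A; keep only the one you need.
@weave.op
def user_model(a: int, b: int) -> int: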
    oai = openai.OpenAI()
    response = oai.chat.completions.create(
        messages=[{"role": "user", "content": f"What is {a}+{b}?"}],
        model="gpt-4o-mini"
    )
    # Use the response in some way (here we just return a + b for simplicity)
    return a + b

# Iterate through examples, predict, and log
for sample in eval_samples:
    inputs = sample["inputs"]
    model_output = user_model(**inputs) # Pass inputs as kwargs

    # Log the prediction input and output
    pred_logger = eval_logger.log_prediction(
        inputs=inputs,
        output=model_output
    )

    # Calculate and log a score for this prediction
    expected = sample["expected"]
    correctness_score = model_output == expected
    pred_logger.log_score(
        scorer="correctness", # Simple string name for the scorer
        score=correctness_score
    )

    # Finish logging for this specific prediction
    pred_logger.finish()

# Log a final summary for the entire evaluation.
# Weave auto-aggregates the 'correctness' scores logged above.
summary_stats = {"subjective_overall_score": 0.8}
eval_logger.log_summary(summary_stats)

print("Evaluation logging complete. View results in the Weave UI.")

ModuleNotFoundError: No module named 'weave.flow.eval_imperative'

# Online evaluation

# Feedback

In [19]:
client = weave.init(PROJECT)
call = client.get_call("0199e574-02a6-7444-80f2-40b5e8fa4e06") #@param

# Adding an emoji reaction
call.feedback.add_reaction("👍")

# Adding a note
call.feedback.add_note("this is a note")

# Adding custom key/value pairs.
# The first argument is a user-defined "type" string.
# Feedback must be JSON serializable and less than 1 KB when serialized.
call.feedback.add("correctness", { "value": 5 })


'0199e57a-4aa2-7e86-9d4d-febd8f1e8b45'
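
The comment in the cell above notes that custom feedback must be JSON serializable and under 1 KB when serialized. A small pre-check like the following can catch oversized payloads before calling `call.feedback.add`. This is a sketch: `check_feedback_payload` is a hypothetical helper, and treating the limit as 1024 UTF-8 bytes is an assumption about how "1 KB" is measured.

```python
import json

MAX_FEEDBACK_BYTES = 1024  # assumed reading of the "less than 1 KB" limit

def check_feedback_payload(payload):
    """Return the serialized size in bytes, or raise ValueError if unusable."""
    try:
        encoded = json.dumps(payload).encode("utf-8")
    except (TypeError, ValueError) as exc:
        raise ValueError(f"payload is not JSON serializable: {exc}") from exc
    if len(encoded) >= MAX_FEEDBACK_BYTES:
        raise ValueError(
            f"payload is {len(encoded)} bytes; limit is {MAX_FEEDBACK_BYTES}"
        )
    return len(encoded)

print(check_feedback_payload({"value": 5}))
```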

# Scorers

In [20]:
import weave
from weave import Scorer


class LengthScorer(Scorer):
    @weave.op
    def score(self, output: str) -> dict:
        """A simple scorer that checks output length."""
        return {
            "length": len(output),
            "is_short": len(output) < 100
        }

@weave.op
def generate_text(prompt: str) -> str:
    return "Hello, world!"

# Get both result and Call object
result, call = generate_text.call("Say hello")

# Now you can apply scorers
await call.apply_scorer(LengthScorer())

ApplyScorerSuccess(result={'length': 13, 'is_short': True}, score_call=Call(...))

weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199e580-130f-7664-9773-3915adba8857
weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199e581-cbc5-7735-8218-4420efa34b6d
weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199e581-cbd0-7488-8ed0-ee2c66fd64ab
weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199e581-cbca-7efa-9ba4-0f2ba4456b0f


## Using Scorers as guardrails
Guardrails act as safety checks that run before LLM output reaches the user. A practical example follows:

In [21]:
import weave
from weave import Scorer
import asyncio
import nest_asyncio  # Required for running asyncio in Google Colab

# Apply nest_asyncio to avoid event loop issues in Google Colab
nest_asyncio.apply()

# ==== 1. Define text generation function ====
@weave.op
def generate_text(prompt: str) -> str:
    """Simulated LLM text generation (basic logic)."""
    responses = {
        "hello": "Hello! How can I help you?",
        "bad": "You are terrible!",  # Example of toxic response
        "good": "You are wonderful!",
    }
    return responses.get(prompt.lower(), "I don't understand your request.")

# ==== 2. Define the Toxicity Scorer ====
class ToxicityScorer(Scorer):
    @weave.op
    def score(self, output: str) -> dict:
        """
        Evaluate the generated content for toxic language.
        """
        toxic_words = {"terrible", "hate", "stupid"}  # Simple keyword-based detection
        flagged = any(word in output.lower() for word in toxic_words)

        return {
            "flagged": flagged,
            "reason": "Detected toxic language" if flagged else None
        }

# ==== 3. Function to generate safe responses ====
async def generate_safe_response(prompt: str) -> str:
    # Generate text using LLM
    result, call = generate_text.call(prompt)

    # Apply toxicity scoring
    safety = await call.apply_scorer(ToxicityScorer())

    # If flagged as toxic, return a warning message
    if safety.result["flagged"]:
        return f"I cannot generate that content: {safety.result['reason']}"

    return result

# ==== 4. Run test cases ====
async def main():
    prompts = ["hello", "bad", "good"]

    for prompt in prompts:
        response = await generate_safe_response(prompt)
        print(f"Prompt: {prompt}\nResponse: {response}\n")

# Run the async function in Google Colab
loop = asyncio.get_event_loop()
loop.run_until_complete(main())


Prompt: hello
Response: Hello! How can I help you?

Prompt: bad
Response: I cannot generate that content: Detected toxic language

Prompt: good
Response: You are wonderful!



## Using Scorers as monitors

Monitors help you track quality metrics over time without blocking operations. This is useful for:

* Identifying quality trends
* Detecting model drift
* Collecting data for model improvement





In [None]:
import weave
from weave import Scorer
import json
import xml.etree.ElementTree as ET
import random
import asyncio
import nest_asyncio  # Required for Google Colab

# Apply nest_asyncio to avoid event loop issues in Google Colab
nest_asyncio.apply()

# ==== 1. Define Text Generation Function ====
@weave.op
def generate_text(prompt: str) -> str:
    """Simulated LLM response generation."""
    if prompt.lower() == "json":
        return '{"message": "Hello, world!"}'  # Valid JSON
    elif prompt.lower() == "xml":
        return "<message>Hello, world!</message>"  # Valid XML
    else:
        return "Generated response..."

# ==== 2. Custom Scorer for JSON Validation ====
class CustomJSONScorer(Scorer):
    @weave.op
    def score(self, output: str) -> dict:
        """Check if the output is valid JSON."""
        try:
            json.loads(output)
            return {"valid_json": True}
        except json.JSONDecodeError:
            return {"valid_json": False}

# ==== 3. Custom Scorer for XML Validation ====
class CustomXMLScorer(Scorer):
    @weave.op
    def score(self, output: str) -> dict:
        """Check if the output is valid XML."""
        try:
            ET.fromstring(output)
            return {"valid_xml": True}
        except ET.ParseError:
            return {"valid_xml": False}

# ==== 4. Function to Generate Response with Monitoring ====
async def generate_with_monitoring(prompt: str) -> str:
    """
    Generates a response and applies monitoring (randomly 10% of the time).
    """
    # Generate text and capture tracking info
    result, call = generate_text.call(prompt)

    # Sample monitoring (only apply scorers to 10% of calls)
    if random.random() < 0.1:
        # Apply manual scorers asynchronously
        json_score = await call.apply_scorer(CustomJSONScorer())
        xml_score = await call.apply_scorer(CustomXMLScorer())

        print(f"Monitoring Applied - JSON Valid: {json_score.result['valid_json']}, XML Valid: {xml_score.result['valid_xml']}")

    return result

# ==== 5. Run Test Cases ====
async def main():
    prompts = ["json", "xml", "text"]

    for prompt in prompts:
        response = await generate_with_monitoring(prompt)
        print(f"Prompt: {prompt}\nResponse: {response}\n")

# Run the async function in Google Colab
loop = asyncio.get_event_loop()
loop.run_until_complete(main())



# Media

## Images

In [22]:
import weave
from openai import OpenAI
import requests
from PIL import Image

weave.init(project_name=PROJECT)
#os.environ["OPENAI_API_KEY"] = ""
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY")
)

@weave.op()
def generate_image(prompt: str) -> Image.Image:
    response = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        size="1024x1024",
        quality="standard",
        n=1,
    )
    image_url = response.data[0].url
    image_response = requests.get(image_url, stream=True)
    image = Image.open(image_response.raw)

    # return a PIL.Image.Image object to be logged as an image
    return image

image = generate_image("a cat with a pumpkin hat")

weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199e585-4cbb-75d4-8d3e-1b6059c7fe48


## Audio

In [23]:
import weave
from openai import OpenAI
import wave

weave.init(project_name=PROJECT)
#os.environ["OPENAI_API_KEY"] = ""
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY")
)
@weave.op
def make_audio_file_streaming(text: str) -> wave.Wave_read:
    with client.audio.speech.with_streaming_response.create(
        model="tts-1",
        voice="alloy",
        input=text,
        response_format="wav",
    ) as res:
        res.stream_to_file("output.wav")

    # return a wave.Wave_read object to be logged as audio
    return wave.open("output.wav")

make_audio_file_streaming("Hello, how are you? What did you do yesterday?")

weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199e585-9f0b-7930-aba3-e764b242dab1


<wave.Wave_read at 0x11995af10>

## Video

In [24]:
import time
from google import genai
from google.genai import types
from moviepy.editor import VideoFileClip, ColorClip, VideoClip
import weave

weave.init(PROJECT)

@weave.op()
def store_videos(name, client, generated_video):
  client.files.download(file=generated_video.video)
  generated_video.video.save(f"new_video{name}.mp4")  # save the video

@weave.op()
def save_videos_in_weave(video_path):
    clip = VideoFileClip(video_path, has_mask=False, audio=True)
    new_clip = clip.subclip(0, 1)
    return new_clip

@weave.op()
def generate_videos(prompt):
    client = genai.Client()  # read API key from GOOGLE_API_KEY
    operation = client.models.generate_videos(
        model="veo-2.0-generate-001",
        prompt=prompt,  # use the function argument instead of a hard-coded prompt
        config=types.GenerateVideosConfig(
            person_generation="dont_allow",  # "dont_allow" or "allow_adult"
            aspect_ratio="16:9",  # "16:9" or "9:16"
            ),
        )
    
    while not operation.done:
        time.sleep(10)
        operation = client.operations.get(operation)
    
    for n, generated_video in enumerate(operation.response.generated_videos):
        client.files.download(file=generated_video.video)
        generated_video.video.save(f"video{n}.mp4")  # save the video
        save_videos_in_weave(f"video{n}.mp4")
        
 
generate_videos("Panning wide shot of a calico kitten sleeping in the sunshine")


ValueError: Missing key inputs argument! To use the Google AI API, provide (`api_key`) arguments. To use the Google Cloud API, provide (`vertexai`, `project` & `location`) arguments.

weave: 🍩 https://wandb.ai/wandb-japan/weave-handson-20251015/r/call/0199e585-b66e-7723-a3a5-904a0b3be4ce
