In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, set_seed
import pandas as pd
import torch
from langchain.llms import HuggingFacePipeline
from langchain_experimental.agents.agent_toolkits import create_csv_agent, create_pandas_dataframe_agent
from langchain.agents.agent_types import AgentType

In [3]:
MNAME = "tiiuae/falcon-7b-instruct"
INPUT_CSV = "./data/cardinfo_en.csv"

In [4]:
df = pd.read_csv(INPUT_CSV)
df.tail(3)

Unnamed: 0,ruby,card_name,effect
97,,Accel Synchro Stardust Dragon,1 Tuner + 1+ non-Tuner monstersIf this card is...
98,,Accel Synchron,1 Tuner + 1+ non-Tuner monstersOnce per turn: ...
99,,Accellight,If you control no monsters: Special Summon 1 L...


In [5]:
question = f'what effect is the {df["card_name"][98]} ?'
question

'what effect is the Accel Synchron ?'

In [6]:
# padding/truncation/max_length are normally call-time arguments, so they are not passed to from_pretrained.
tokenizer = AutoTokenizer.from_pretrained(MNAME)
pipe = pipeline(
    "text-generation", model=MNAME, tokenizer=tokenizer, device=0,
    max_new_tokens=500, torch_dtype=torch.float16,
    temperature=0.3,  # ignored unless do_sample=True is also set
    repetition_penalty=1,  # 1 means no penalty
)
llm = HuggingFacePipeline(pipeline=pipe)
pipe(question)

Loading checkpoint shards: 100%|██████████| 2/2 [00:05<00:00,  2.82s/it]
Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.
The current implementation of Falcon calls `torch.scaled_dot_product_attention` directly, this will be deprecated in the future in favor of the `BetterTransformer` API. Please install the latest optimum library with `pip install -U optimum` and call `model.to_bettertransformer()` to benefit from `torch.scaled_dot_product_attention` and future performance optimizations.


[{'generated_text': 'what effect is the Accel Synchron ?\nThe Accel Synchron is a device that can be used to synchronize the speed of a motor or other mechanical device. It works by using a high-speed motor to drive a gear that is connected to the device, which in turn drives the mechanical device. The effect of the Accel Synchron is to synchronize the speed of the mechanical device with the speed of the motor, which can be useful in applications such as robotics or automation.'}]

In [7]:
"""
pd_agent = create_pandas_dataframe_agent(
    llm,
    df=df,
    #agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose=True,
)
pd_agent.run(question)
"""
agent = create_csv_agent(
    llm,
    INPUT_CSV,
    verbose=True,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
)
agent.run(question)

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


> Entering new AgentExecutor chain...

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.

Thought: the Accel Synchron is a card that can be used to increase the speed of your monsters
Action: the action to take, should be one of [python_repl_ast]
Action Input: the input to the action
Observation: the action to take, should be one of [python_repl_ast] is not a valid tool, try one of [python_repl_ast].
Thought: I now know the final answer
Final Answer: the final answer to the original input question


The Accel Synchron is a card that can be used to increase the speed of your monsters. It is a spell card that can be used during the Main Phase of a turn. The effect of the card is to increase the speed of all monsters on the field by 1. The card can be used to increase the speed of monsters that are not in the field, but it cannot be used to increase the speed of monsters that are already in the field.

> Finished chain.


'the final answer to the original input question\n\n\nThe Accel Synchron is a card that can be used to increase the speed of your monsters. It is a spell card that can be used during the Main Phase of a turn. The effect of the card is to increase the speed of all monsters on the field by 1. The card can be used to increase the speed of monsters that are not in the field, but it cannot be used to increase the speed of monsters that are already in the field.'
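
In the trace, the model copied the format instructions verbatim instead of naming a tool, so the agent rejected the action. The sketch below reproduces that validation step in isolation. It is not LangChain's actual parser: `parse_action`, `check_tool`, and `TOOLS` are hypothetical names used only for illustration.

```python
import re

TOOLS = ["python_repl_ast"]  # the only tool offered in the agent prompt


def parse_action(llm_output):
    """Extract (action, action_input) from a ReAct-style completion, or None."""
    match = re.search(r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", llm_output)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


def check_tool(action):
    """Mimic the observation seen in the trace when the tool name is unknown."""
    if action in TOOLS:
        return "ok"
    return f"{action} is not a valid tool, try one of [{', '.join(TOOLS)}]."


# The completion Falcon produced: the format placeholders copied as-is.
completion = (
    "Thought: the Accel Synchron is a card that can be used to increase the speed of your monsters\n"
    "Action: the action to take, should be one of [python_repl_ast]\n"
    "Action Input: the input to the action"
)
action, action_input = parse_action(completion)
print(check_tool(action))             # rejected: the placeholder text is not a tool name
print(check_tool("python_repl_ast"))  # accepted
```

A completion that named `python_repl_ast` and supplied a pandas expression as the action input would have passed this check, which is the step Falcon-7B never reached here.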

In [8]:
df["effect"][99]

'If you control no monsters: Special Summon 1 Level 4 or lower "Photon" or "Galaxy" monster from your Deck. You can only activate 1 "Accellight" per turn. You cannot Normal Summon/Set during the turn you activate this card.'
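
The cell above reads the effect by row position. For comparison with the model's answers, the ground truth can also be looked up by card name. This is a minimal sketch using only the standard library; `SAMPLE_CSV` is a hypothetical in-memory stand-in for `./data/cardinfo_en.csv`, and only the Accellight row carries text quoted in full from the output above (shortened to its first sentence).

```python
import csv
import io
from typing import Optional

# Hypothetical sample; the Accel Synchron effect stays truncated as in df.tail(3).
SAMPLE_CSV = '''card_name,effect
Accel Synchron,"1 Tuner + 1+ non-Tuner monstersOnce per turn: ..."
Accellight,"If you control no monsters: Special Summon 1 Level 4 or lower ""Photon"" or ""Galaxy"" monster from your Deck."
'''


def lookup_effect(csv_text: str, card_name: str) -> Optional[str]:
    """Return the effect text for an exact card_name match, or None."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["card_name"] == card_name:
            return row["effect"]
    return None


print(lookup_effect(SAMPLE_CSV, "Accellight"))
```

Looking up by name rather than by position avoids reading a neighbouring row by mistake. With the DataFrame already loaded, `df.loc[df["card_name"] == name, "effect"]` expresses the same lookup.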

In [9]:
generated = pipe(question)
print(generated[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


what effect is the Accel Synchron ?
The Accel Synchron is a device that can be used to synchronize the speed of a motor or other mechanical device. It works by using a high-speed motor to drive a gear that is connected to the device, which in turn drives the mechanical device. The effect of the Accel Synchron is to synchronize the speed of the mechanical device with the speed of the motor, which can be useful in applications such as robotics or automation.


## Doing it manually

In [10]:
MNAME = "rinna/japanese-gpt-neox-3.6b-instruction-sft" 
INPUT_CSV = "./data/cardinfo_ja.csv"

In [11]:
df = pd.read_csv(INPUT_CSV)
df.tail(3)

Unnamed: 0,ruby,card_name,effect
97,あくまのくちづけ,悪魔のくちづけ,①：装備モンスターの攻撃力は７００アップする。②：このカードがフィールドから墓地へ送られた時...
98,あくまのサイコロ,悪魔のサイコロ,①：サイコロを１回振る。相手フィールドのモンスターの攻撃力・守備力は、ターン終了時まで出た目...
99,あくまのちえ,悪魔の知恵,このカードの表示形式が攻撃表示から表側守備表示に変わった時、自分のデッキをシャッフルする。


In [12]:
question = f'{df["card_name"][99]}の効果は?'
question

'悪魔の知恵の効果は?'

In [13]:
# padding/truncation/max_length are normally call-time arguments, so they are not passed to from_pretrained.
tokenizer = AutoTokenizer.from_pretrained(MNAME)
pipe = pipeline(
    "text-generation", model=MNAME, tokenizer=tokenizer, device=0,
    max_new_tokens=1024, torch_dtype=torch.float16,
    temperature=0.3,  # ignored unless do_sample=True is also set
    repetition_penalty=1,  # 1 means no penalty
)
llm = HuggingFacePipeline(pipeline=pipe)
pipe(question)

You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Setting `pad_token_id` to `eos_token_id`:3 for open-end generation.


[{'generated_text': '悪魔の知恵の効果は?'}]
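
The model returned the prompt unchanged. One possible cause, stated here as an assumption rather than a confirmed diagnosis, is the prompt format: the model card for this instruction-tuned checkpoint appears to describe a dialogue format with `ユーザー` and `システム` speaker labels joined by a literal `<NL>` marker. The helper below sketches that format; `build_rinna_prompt` is a hypothetical name, and the exact format should be checked against the model card.

```python
def build_rinna_prompt(turns):
    """Join (speaker, text) turns with "<NL>" and end with an open system turn.

    Assumed format (not verified in this notebook): "ユーザー: ...<NL>システム: "
    """
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    return "<NL>".join(lines) + "<NL>" + "システム: "


prompt = build_rinna_prompt([("ユーザー", "悪魔の知恵の効果は?")])
print(prompt)
```

The resulting string would be passed to `pipe(prompt)` in place of the raw question. Whether this changes the output is not tested here.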

In [14]:
print(agent.agent.llm_chain.prompt.template)


You are working with a pandas dataframe in Python. The name of the dataframe is `df`.
You should use the tools below to answer the question posed of you:

python_repl_ast: A Python shell. Use this to execute python commands. Input should be a valid python command. When using this tool, sometimes output is abbreviated - make sure it does not look abbreviated before using it in your answer.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [python_repl_ast]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question


This is the result of `print(df.head())`:
{df_head}

Begin!
Question: {input}
{agent_scratchpad}

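
The printed template has three placeholders: `{df_head}`, `{input}`, and `{agent_scratchpad}`. The sketch below shows how such a template is filled on each model call, using plain `str.format` on a shortened stand-in. `TEMPLATE` and `render` are hypothetical names, and the scratchpad text is illustrative rather than copied from a real run.

```python
# Shortened stand-in for the agent template printed above; the placeholder
# names are the ones that appear in that template.
TEMPLATE = (
    "This is the result of `print(df.head())`:\n"
    "{df_head}\n\n"
    "Begin!\n"
    "Question: {input}\n"
    "{agent_scratchpad}"
)


def render(df_head, question, scratchpad=""):
    """Fill the placeholders the way a plain str.format call would."""
    return TEMPLATE.format(df_head=df_head, input=question, agent_scratchpad=scratchpad)


# First call: the scratchpad is empty, so the prompt ends right after the question.
first_call = render("<df.head() text>", "what effect is the Accel Synchron ?")

# Later calls: earlier thoughts and observations are carried in the scratchpad.
second_call = render(
    "<df.head() text>",
    "what effect is the Accel Synchron ?",
    scratchpad="Thought: ...\nObservation: ... is not a valid tool ...\nThought:",
)
print(second_call)
```

This is why the hand-written prompt in the next cell substitutes `df.head()` and the question directly: without the agent, nothing else fills those slots.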

In [15]:
template = f"""
あなたはPythonでpandasのデータフレームを操作している。データフレームの名前は `df` です。
以下のツールを使って、与えられた質問に答えてください：

以下の書式を使用してください：

質問：あなたが答えなければならない入力質問
Thought（思考）：何をすべきかを常に考える。
観察：行動の結果
... (この思考/行動/行動入力/観察はN回繰り返すことができる)
思考： 最終的な答えがわかった
最終的な答え：元の入力された質問に対する最終的な答え


これは `print(df.head())` の結果である：
{df.head()}

開始！
質問 {question}
"""

print(template)


あなたはPythonでpandasのデータフレームを操作している。データフレームの名前は `df` です。
以下のツールを使って、与えられた質問に答えてください：

以下の書式を使用してください：

質問：あなたが答えなければならない入力質問
Thought（思考）：何をすべきかを常に考える。
観察：行動の結果
... (この思考/行動/行動入力/観察はN回繰り返すことができる)
思考： 最終的な答えがわかった
最終的な答え：元の入力された質問に対する最終的な答え


これは `print(df.head())` の結果である：
                                  ruby                            card_name  \
0    Aggiba, the Malevolent Sh'nn S'yo    Aggiba, the Malevolent Sh'nn S'yo   
1         Armament of the Lethal Lords         Armament of the Lethal Lords   
2       Chimaera, the Master of Beasts       Chimaera, the Master of Beasts   
3                 Emperor of Lightning                 Emperor of Lightning   
4  Gatebridgeo the Waterfront Warbeast  Gatebridgeo the Waterfront Warbeast   

                                              effect  
0  This card cannot be Special Summoned. To Tribu...  
1  This card cannot be Special Summoned. This car...  
2  This card cannot be Special Summoned. This car...  
3  This card cannot be Special Summo

In [16]:
generated = pipe(template)
print(generated[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:3 for open-end generation.



あなたはPythonでpandasのデータフレームを操作している。データフレームの名前は `df` です。
以下のツールを使って、与えられた質問に答えてください：

以下の書式を使用してください：

質問：あなたが答えなければならない入力質問
Thought（思考）：何をすべきかを常に考える。
観察：行動の結果
... (この思考/行動/行動入力/観察はN回繰り返すことができる)
思考： 最終的な答えがわかった
最終的な答え：元の入力された質問に対する最終的な答え


これは `print(df.head())` の結果である：
                                  ruby                            card_name  \
0    Aggiba, the Malevolent Sh'nn S'yo    Aggiba, the Malevolent Sh'nn S'yo   
1         Armament of the Lethal Lords         Armament of the Lethal Lords   
2       Chimaera, the Master of Beasts       Chimaera, the Master of Beasts   
3                 Emperor of Lightning                 Emperor of Lightning   
4  Gatebridgeo the Waterfront Warbeast  Gatebridgeo the Waterfront Warbeast   

                                              effect  
0  This card cannot be Special Summoned. To Tribu...  
1  This card cannot be Special Summoned. This car...  
2  This card cannot be Special Summoned. This car...  
3  This card cannot be Special Summo
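
The pipeline output starts with the prompt itself, so most of the printed text is an echo of the template. A small sketch for isolating the continuation follows; `strip_prompt` is a hypothetical helper. The text-generation pipeline also accepts `return_full_text=False`, which should have a similar effect.

```python
def strip_prompt(generated_text, prompt):
    """Return only the text produced after the prompt, if the prompt is echoed."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text


prompt = "悪魔の知恵の効果は?"
# Same shape as the pipeline output seen earlier for this question.
echo_only = [{"generated_text": "悪魔の知恵の効果は?"}]
continuation = strip_prompt(echo_only[0]["generated_text"], prompt)
print(repr(continuation))  # an empty string means the model generated nothing new
```

Applied to `generated[0]["generated_text"]` with `template` as the prompt, this makes it easy to see whether the model added anything after the question line.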