## 概要

- `gemini`を使用して、論文のアブストラクトの評価を行うスクリプト
- `gemini`の出力は JSON 形式に固定
- `rules`で評価指標を指定
- ``で用語等の定義を指定


In [29]:
# 初期設定
!pip install pandas
import pathlib
import textwrap
import google.generativeai as genai
import os
from dotenv import load_dotenv
from IPython.display import display
from IPython.display import Markdown
import pandas as pd



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m24.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [30]:
file_name = "Biochemistry_Molecular_Biology_low1000"  # ファイル名
input_file = f"../data/csv/{file_name}.csv"  # タブ区切りの.txtファイル
output_file = f"../data/result/{file_name}.csv"  # 出力する.csvファイル

try:
    df = pd.read_csv(input_file, encoding="utf-8")
    print("データの読み込みに成功しました。")
except Exception as e:
    print(f"エラーが発生しました: {e}")

データの読み込みに成功しました。


In [31]:
df.head()

Unnamed: 0,Publication Type,Authors,Title,Abstract,DOI
0,J,"Athare, SV; Gejji, SP",Regioselectivity in nonsymmetric methyl pentyl...,The present work illustrates regioselective bi...,10.1016/j.jmgm.2019.107960
1,J,"Brunetti, M; Mortola, JP",Hypoxic hypometabolism in chicken embryos: con...,"Postnatally, during hypoxia the decrease in ox...",10.1016/j.cbpa.2019.110578
2,J,"Du, ZF; Qu, Y; Farrell, NP",Intramolecular platinum migration on a peptide...,We report the migration of platinum ligand uni...,10.1016/j.jinorgbio.2019.110858
3,B,"Jahn, D; Geier, A",Transcriptional control of cells by vitamin D ...,,10.1016/B978-0-12-811907-5.00030-0
4,J,"Pakravan, M; Shamsollahi, MB","Spatial and temporal joint, partially-joint an...",absectionBackground Three types of sources can...,10.1016/j.jneumeth.2019.108453


In [32]:
# モデルのインスタンスを作成
model = genai.GenerativeModel(
    "gemini-1.5-flash",
    generation_config={"response_mime_type": "application/json"}
)

In [None]:
# 入力部分
Abstract = "Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers-8x deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions(1), where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation."

In [None]:
# 用語等の定義
definition = """
用語等の定義を入力
"""

# 指示部分
instruction = """
Please answer yes or no if your abstract follows each of the rules in JSON format.

Use this JSON schema:

results = {'abstract_id': int, 'rules': list[str]}
Return: list[results]

"""

# アブストラクト
abstract = f"""
Abstract: {Abstract}
"""

# 評価指標
rules = """
Rules:
1. The objectives, methods, results, and conclusions are clearly stated.
2. It avoids the passive voice and uses active expressions.
3. Separating facts from claims.
"""

# プロンプトの作成
prompt = f"{instruction}\n{abstract}\n{definition}\n{rules}" 

In [35]:
# geminiを使った回答の生成
def generate_response(model, prompt):
    response = model.generate_content(prompt)
    return response


In [36]:
print(generate_response(model, prompt).text)

[{"abstract_id": 1, "rules": ["yes", "yes", "yes"]}]
