<div style="background:#00000">
<div> 
<h1>Starting Kit // <b>PEER-REVIEWED JOURNAL OF AI-AGENTS 2023</b> </h1>
<p>
This starting kit walks you step by step through the data, its statistics, and a few examples. It should give you a clear idea of what the challenge is about and how to start solving it.
</p> 
</div>
<div>
<br/><br/>
<details>
<summary>Technical Details</summary>
<p>
This code was tested with <a href="https://www.python.org/downloads/release/python-385/">Python 3.8.5</a> | <a href="https://anaconda.org/">Anaconda</a> custom (64-bit) | (default, Dec 23 2020, 21:19:02).
</p>
</details>

<details> 
<summary>Disclaimer</summary>
  <p>
  ALL INFORMATION, SOFTWARE, DOCUMENTATION, AND DATA ARE PROVIDED "AS-IS". The CHALEARN, AND/OR OTHER ORGANIZERS OR CODE AUTHORS DISCLAIM ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR ANY PARTICULAR PURPOSE, AND THE WARRANTY OF NON-INFRINGEMENT OF ANY THIRD PARTY'S INTELLECTUAL PROPERTY RIGHTS. IN NO EVENT SHALL AUTHORS AND ORGANIZERS BE LIABLE FOR ANY SPECIAL, 
  INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF SOFTWARE, DOCUMENTS, MATERIALS, PUBLICATIONS, OR INFORMATION MADE AVAILABLE FOR THE CHALLENGE. 
  </p>
</details>
<details> 
<summary>References and Credits </summary>
  <p>
  <ul>
      <li><a href="https://www.universite-paris-saclay.fr/">Université Paris Saclay</a></li>
      <li><a href="http://www.chalearn.org/">ChaLearn</a></li>
  </ul>
  </p>
</details> 

<details open> 
<summary>Collaborators</summary>
  <p>
This challenge was organized by <b>Isabelle Guyon</b>, <b>Benedictus Kent Rachmat</b> and <b>Khuong Thanh Gia Hieu</b>.
  </p>
</details> 
</div>
<hr>

<button type="button" ><h3><a href="https://www.codabench.org/competitions/866/">Competition Site</a></h3></button> 
<button type="button" ><h3><a href="https://sites.google.com/view/ai-agents/home">Landing Page Site</a></h3></button> 
<button type="button"><h3><a href="mailto:ai-agent-journal@chalearn.org">Contact us</a></h3></button>
<hr>
</div>

# Step 1: Install required packages

In [None]:
!git clone https://github.com/kentrachmat/public_ai_paper_challenge_codabench.git

In [None]:
%cd public_ai_paper_challenge_codabench

In [None]:
%pip install openai

## Import the necessary libraries

In [None]:
import numpy as np

In [None]:
MODEL_DIR = 'sample_code_submission/' # Change the model to a better one once you have one!
RESULT_DIR = 'sample_result_submission/' 
PROBLEM_GENERATOR_DIR = 'generator_ingestion_program/'  
SCORE_GENERATOR_DIR = 'generator_scoring_program/'
PROBLEM_REVIEWER_DIR = 'reviewer_ingestion_program/'  
SCORE_REVIEWER_DIR = 'reviewer_scoring_program/'
DATA_DIR = 'sample_data'  
DATA_NAME = 'ai_paper_challenge' # DO NOT CHANGE
SCORING_OUTPUT_DIR = 'scoring_output'

# from sys import path; path.append(MODEL_DIR); path.append(PROBLEM_DIR); path.append(SCORE_DIR); 
%matplotlib inline
# Auto-reload edited modules; comment out the next two lines if this causes problems with pickles in Python 3
%load_ext autoreload
%autoreload 2

import seaborn as sns; sns.set()
import warnings

warnings.simplefilter(action='ignore', category=FutureWarning)
%reload_ext autoreload

***
# Step 2: Exploratory data analysis
We provide `sample_data` with the starting kit. It contains 3 prompts for the AI-Author track and 51 papers for the AI-Reviewer track.

### Load Data

In [None]:
from generator_ingestion_program.data_io import read_data as read_data_generator
from reviewer_ingestion_program.data_io import read_data as read_data_reviewer

In [None]:
data_generator, _ = read_data_generator(DATA_DIR, random_state=42)
data_reviewer, _ = read_data_reviewer(DATA_DIR, random_state=42)

### Data Statistics

In [None]:
print("Total Generation prompts: ", len(np.unique(data_generator['generator']['ids'])))
print("Total Reviewer papers : ", len(np.unique(data_reviewer['reviewer']['ids'])))

### Visualization

In [None]:
print("Instruction for the AI-Author track:")
print(data_generator['generator']['instructions'])

In [None]:
print("Instruction for the AI-Reviewer track:")
print(data_reviewer['reviewer']['instructions'])

In [None]:
# Generator prompts
print("Generator prompts:")
for p in data_generator['generator']['prompts']:
    print(p)
    print("Length:",len(p))
    print()
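
The per-prompt lengths printed above can also be condensed into a few summary numbers. The sketch below is one way to do that; `summarize_lengths` and `example_prompts` are hypothetical names introduced here for illustration and are not part of the starting kit.

```python
from statistics import mean


def summarize_lengths(texts):
    """Return count, min, max and mean character length of a list of strings."""
    lengths = [len(t) for t in texts]
    if not lengths:
        return {"count": 0, "min": 0, "max": 0, "mean": 0.0}
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
    }


# Hypothetical stand-in prompts; in the notebook you would pass
# data_generator['generator']['prompts'] instead.
example_prompts = ["Write a survey on X", "Explain Y", "Z"]
summary = summarize_lengths(example_prompts)
print(summary)
```

The same helper should work on `data_reviewer['reviewer']['papers']`, assuming each paper is stored as a plain string.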

In [None]:
# Reviewer texts
data_reviewer['reviewer']['papers'][0]

***
# Step 2: Building a predictive model
We provide two main functions for your baseline model. Here they are implemented with the [OpenAI API](https://platform.openai.com/docs/introduction). Please follow the [instructions](https://github.com/kentrachmat/public_ai_paper_challenge_codabench/blob/master/how_to_get_openai_api.md) to get an API key. Feel free to modify them and be creative!

- `generate_papers(prompts, instruction)`, where `prompts` are the given prompts and `instruction` is the instruction passed to the OpenAI API
- `review_papers(papers, instruction)`, where `papers` are the papers to be reviewed and `instruction` is the instruction passed to the OpenAI API
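
To make the shape of these two functions concrete, here is a minimal offline sketch of a class exposing the same two method names. It does not call any API. `EchoModel` is a hypothetical stand-in, and the assumption that each method takes a list and returns a list of strings of the same length is ours; check `sample_code_submission/model.py` for the actual return format.

```python
class EchoModel:
    """Hypothetical stand-in with the same two method names as the baseline."""

    def generate_papers(self, prompts, instruction):
        # One output string per prompt; a real model would query an LLM here.
        return [f"{instruction}\n\n{prompt}" for prompt in prompts]

    def review_papers(self, papers, instruction):
        # One review string per paper; this placeholder ignores the instruction
        # and only reports the paper length.
        return [f"Placeholder review of a {len(paper)}-character paper." for paper in papers]


toy_model = EchoModel()
toy_papers = toy_model.generate_papers(["Prompt A", "Prompt B"], "Write a paper.")
toy_reviews = toy_model.review_papers(toy_papers, "Review the paper.")
print(len(toy_papers), len(toy_reviews))
```

A stub like this can be handy for exercising the rest of the pipeline without spending API credits.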

In [None]:
from generator_ingestion_program.data_io import write as write_generator
from reviewer_ingestion_program.data_io import write as write_reviewer
from sample_code_submission.model import model

In [None]:
myModel = model()
# myModel.set_api_key("YOUR API") # and change sample_code_submission/sample_submission_chatgpt_api_key.json

generator_X = data_generator["generator"]["prompts"]
reviewer_X = data_reviewer["reviewer"]["papers"]
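
If you prefer not to paste the key into the notebook, one option is to read it from an environment variable. This is only a sketch: `load_api_key` is a hypothetical helper, and using `OPENAI_API_KEY` as the variable name is an assumption rather than something the starting kit requires.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the key stored in the given environment variable, or None if unset or blank."""
    key = os.environ.get(env_var, "").strip()
    return key or None


api_key = load_api_key()
if api_key is None:
    print("No API key found; set the environment variable before running the baseline.")
# Otherwise, pass it on with the method named in the cell above:
# myModel.set_api_key(api_key)
```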

In [None]:
# Generator 
generator_Y_hat = myModel.generate_papers(generator_X, data_generator["generator"]["instructions"])

In [None]:
# Reviewer
reviewer_Y_hat = myModel.review_papers(reviewer_X, data_reviewer["reviewer"]["instructions"])

In [None]:
# save generator and reviewer results
result_name = RESULT_DIR + DATA_NAME
write_generator(result_name + '_generator.predict', generator_Y_hat)
write_reviewer(result_name + '_reviewer.predict', reviewer_Y_hat)

!ls $result_name*

***
# Step 3: Making a submission

## Unit testing

It is <b><span style="color:red">important that you test your submission files before submitting them</span></b>. All you have to do to make a submission is modify the file <code>model.py</code> in the <code>sample_code_submission/</code> directory, then run this test to make sure everything works fine. This is the actual program that will be run on the server to test your submission. 
<br>
Keep the sample code simple.<br>

<code>python3</code> is required for this step

In [None]:
# AI-Author Unit Testing
!python3 $PROBLEM_GENERATOR_DIR/ingestion.py $DATA_DIR $RESULT_DIR $PROBLEM_GENERATOR_DIR $MODEL_DIR

In [None]:
# AI-Reviewer Unit Testing
!python3 $PROBLEM_REVIEWER_DIR/ingestion.py $DATA_DIR $RESULT_DIR $PROBLEM_REVIEWER_DIR $MODEL_DIR
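
After both ingestion runs, it can be worth confirming that prediction files actually landed in the result directory. The sketch below assumes the ingestion programs write files with the same `_generator.predict` and `_reviewer.predict` suffixes used when saving results earlier in this notebook; `missing_or_empty` is a hypothetical helper, not part of the starting kit.

```python
from pathlib import Path


def missing_or_empty(result_dir, data_name,
                     suffixes=("_generator.predict", "_reviewer.predict")):
    """Return the expected prediction files that are absent or have zero size."""
    problems = []
    for suffix in suffixes:
        path = Path(result_dir) / f"{data_name}{suffix}"
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(str(path))
    return problems


# An empty list means both files exist and contain data.
print(missing_or_empty("sample_result_submission", "ai_paper_challenge"))
```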

## Test scoring program

We provide a simple baseline reviewer to score both your generated papers and your reviewer. You can find the results in the `scoring_output/` folder.

For submissions on Codabench, you can download the "Output from scoring step". Note that the evaluation on Codabench differs from the provided baseline reviewer.

In [None]:
# AI-Author Test Scoring 
!python3 $SCORE_GENERATOR_DIR/score.py $DATA_DIR $RESULT_DIR $SCORING_OUTPUT_DIR

In [None]:
# AI-Reviewer Test Scoring 
!python3 $SCORE_REVIEWER_DIR/score.py $DATA_DIR $RESULT_DIR $SCORING_OUTPUT_DIR

# Step 4: Prepare the submission
Submit the zip file to Codabench to get both numeric and textual feedback.

In [None]:
import datetime 
from generator_ingestion_program.data_io import zipdir as zipdir_generator
# from reviewer_ingestion_program.data_io import zipdir as zipdir_reviewer

the_date = datetime.datetime.now().strftime("%y-%m-%d-%H-%M")
sample_code_submission = 'sample_code_submission_' + the_date + '.zip'
zipdir_generator(sample_code_submission, MODEL_DIR, exclude_folders=['__pycache__'], exclude_files=[f'{DATA_NAME}_model.pickle'])
# zipdir_reviewer(sample_code_submission, MODEL_DIR, exclude_folders=['__pycache__'], exclude_files=[f'{DATA_NAME}_model.pickle'])
print("Submit this file to Codabench:\n" + sample_code_submission)
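
For readers curious about what such a zipping step involves, here is a standard-library sketch that archives a directory while skipping excluded folders and files. It is not the kit's `zipdir` implementation: `zip_directory` is a hypothetical function, and storing files at the archive root is an assumption about what the submission format expects.

```python
import os
import zipfile


def zip_directory(zip_path, source_dir, exclude_folders=("__pycache__",), exclude_files=()):
    """Write every file under source_dir into zip_path, skipping excluded names."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(source_dir):
            # Prune excluded folders in place so os.walk does not descend into them.
            dirs[:] = [d for d in dirs if d not in exclude_folders]
            for name in files:
                if name in exclude_files:
                    continue
                full_path = os.path.join(root, name)
                # Store paths relative to source_dir so files sit at the archive root.
                archive.write(full_path, os.path.relpath(full_path, source_dir))


# Example call (the zip file should live outside source_dir):
# zip_directory("my_submission.zip", "sample_code_submission/")
```

Opening the resulting archive with `zipfile.ZipFile(path).namelist()` is a quick way to check its contents before uploading.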