# OpenReview mental model

This notebook explores how the different notes in OpenReview relate to one another, in particular the notes that represent manuscript revisions and comments. As a running example we use the paper [Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets](https://openreview.net/forum?id=Skh4jRcKQ). (It's the first paper in the ICLR 2019 forum iterator that has updates to its manuscript.)

In [49]:
# Setup and helpers

from datetime import datetime
import openreview
import pandas as pd

guest_client = openreview.Client(baseurl='https://api.openreview.net')

FORUM_ID = "Skh4jRcKQ"

def clean_date(timestamp):
  return datetime.fromtimestamp(int(timestamp/1000)).strftime("%m/%d/%Y, %H:%M")
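
`clean_date` formats timestamps in the local timezone of the machine running the notebook, so the printed times depend on where it is run. A minimal sketch of a variant pinned to UTC follows; the name `clean_date_utc` is an assumption and not part of the notebook, and it assumes the timestamps are milliseconds since the epoch, as `clean_date` does. The sample value is the `cdate` of the PapersWithCode reference printed later in this notebook.

```python
from datetime import datetime, timezone

def clean_date_utc(timestamp_ms):
    """Format a millisecond epoch timestamp as a UTC date string (hypothetical helper)."""
    seconds = timestamp_ms / 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%m/%d/%Y, %H:%M")

sample = clean_date_utc(1644493211604)
print(sample)  # 02/10/2022, 11:40
```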

## Selecting notes of interest

There are a number of notes associated with this forum ID. We distinguish the 'top note', whose note ID is the same as the forum ID.

`guest_client.get_notes(forum=FORUM_ID)` retrieves all notes (comments and submissions) associated with this forum.

In [32]:
all_notes = guest_client.get_notes(forum=FORUM_ID)
top_note, = [note for note in all_notes if note.id == note.forum] # Exactly one
#all_other_notes = [note for note in all_notes if not note.id == note.forum]
print("Total number of notes including top note: ", len(all_notes))
print("Top note id == Top note forum == FORUM_ID: ", top_note.id == top_note.forum == FORUM_ID)

Total number of notes including top note:  16
Top note id == Top note forum == FORUM_ID:  True
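
The selection above can also be sketched as a small helper that separates the top note from the replies in one pass. The helper name `split_forum_notes` and the stand-in objects are hypothetical; in the notebook the input would be the list returned by `guest_client.get_notes(forum=FORUM_ID)`.

```python
from types import SimpleNamespace

def split_forum_notes(notes):
    """Return (top_note, replies); the top note is the one whose id equals its forum."""
    top = [n for n in notes if n.id == n.forum]
    if len(top) != 1:
        raise ValueError(f"expected exactly one top note, found {len(top)}")
    replies = [n for n in notes if n.id != n.forum]
    return top[0], replies

# Stand-ins carrying only the two attributes the helper reads.
demo_notes = [
    SimpleNamespace(id="Skh4jRcKQ", forum="Skh4jRcKQ"),
    SimpleNamespace(id="rkey8cX9hm", forum="Skh4jRcKQ"),
]
demo_top, demo_replies = split_forum_notes(demo_notes)
print(demo_top.id, len(demo_replies))
```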


## Getting the pdfs from the forum-level note

The top note has `id=Skh4jRcKQ` and `original=HkxB3XguKm`. This is because `get_notes` returns the `BlindSubmission` note. The original note (`HkxB3XguKm`) is the one to which all the PDF revisions are attached.

This is explained [here](https://openreview-py.readthedocs.io/en/latest/mental_models.html).

To get the different PDFs we have to request references with `original=True`. Requesting the references of the blind submission without the flag returns only a single reference, whereas with `original=True` we get all the PDF revisions. In effect, this requests the references of `HkxB3XguKm` instead of `Skh4jRcKQ`.

In [39]:
print("top_note.id:", top_note.id)
print("top_note.original:", top_note.original)

# Quick helper function to compare reference lists
def get_reference_ids(referent_id, original_flag):
  return [x.id for x in guest_client.get_references(referent=referent_id, original=original_flag)]

original_true_references = get_reference_ids(top_note.id, True)
original_false_references = get_reference_ids(top_note.id, False)
print("Same results retrieved with original=True and original=False: ", original_true_references == original_false_references)
print("Num. results retrieved with original=True: ", len(original_true_references))
print("Num. results retrieved with original=False: ", len(original_false_references))


top_note.id: Skh4jRcKQ
top_note.original: HkxB3XguKm
Same results retrieved with original=True and original=False:  False
Num. results retrieved with original=True:  32
Num. results retrieved with original=False:  1
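
Comparing the two lists with `==` only says whether they differ. A minimal sketch of a set-based comparison that also shows which IDs appear under only one flag follows; `compare_id_lists` is a hypothetical helper and the IDs in the demo are placeholders, not real reference IDs.

```python
def compare_id_lists(ids_a, ids_b):
    """Summarize how two lists of reference IDs overlap (order is ignored)."""
    set_a, set_b = set(ids_a), set(ids_b)
    return {
        "only_in_a": sorted(set_a - set_b),
        "only_in_b": sorted(set_b - set_a),
        "in_both": sorted(set_a & set_b),
    }

# Placeholder IDs for illustration only.
demo_diff = compare_id_lists(["ref_1", "ref_2", "ref_3"], ["ref_2"])
print(demo_diff)
```

In the notebook this could be called as `compare_id_lists(original_true_references, original_false_references)`.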


## Exploring the original note

We will store the original note's _id_ in a variable for later use. As a guest user, we cannot retrieve the Note _object_ of the original: the request fails with a `ForbiddenError`.

In [40]:
top_note_original_id = top_note.original
try:
  temp = guest_client.get_notes(id=top_note.original)[0]
except openreview.OpenReviewException as e:
  print("Could not retrieve a Note object for the top note's original note.\nError message: ", e)

Could not retrieve a Note object for the top note's original note.
Error message:  {'name': 'ForbiddenError', 'message': 'User Guest does not have permission to see Note HkxB3XguKm', 'status': 403, 'details': {'user': 'guest_1666908803345', 'reqId': '2022-10-27-1410388'}}
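
Since several notes in this notebook turn out to be inaccessible to guests, the try/except pattern above can be wrapped in a reusable helper. This is a sketch under assumptions: `try_get_note` and the stub `forbidden_fetch` are hypothetical, and the helper catches any `Exception` so that it runs without a network connection. In the notebook one would likely narrow this to `openreview.OpenReviewException` and pass `lambda note_id: guest_client.get_notes(id=note_id)` as `fetch`.

```python
def try_get_note(fetch, note_id):
    """Return (note, None) on success, or (None, error_message) on failure."""
    try:
        notes = fetch(note_id)
    except Exception as e:  # Assumption: broad catch for the offline demo only
        return None, str(e)
    if not notes:
        return None, "no note returned"
    return notes[0], None

def forbidden_fetch(note_id):
    # Stub that mimics a permission failure; not an OpenReview call.
    raise PermissionError(f"no permission to see Note {note_id}")

demo_note, demo_error = try_get_note(forbidden_fetch, "HkxB3XguKm")
print(demo_note, "|", demo_error)
```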


**Question**: What is the difference between:
```
guest_client.get_references(referent=top_note.original, original=False)
guest_client.get_references(referent=top_note.id, original=True)
```

In [47]:
a = guest_client.get_references(referent=top_note.id, original=False)
print(len(a))
print(a[0].referent, a[0].original)
a = guest_client.get_references(referent=top_note.id, original=True)
print(len(a))
print(a[3].referent, a[3].original)

1
Skh4jRcKQ HkxB3XguKm
32
Skh4jRcKQ HkxB3XguKm


Also, the following two requests can return the same results, as they do for this paper:

```
guest_client.get_references(referent=note.original, original=True)
guest_client.get_references(referent=note.original, original=False)
```

**Question**: Why? Is it because `HkxB3XguKm` has no original? I can't check because I can't access `HkxB3XguKm`.

**Question**: What does it mean for an original note to have a PapersWithCode reference?

In [33]:
print(f'Requesting references with original=True and False for top_note.original ({top_note.original})')
l0 = [x.id for x in guest_client.get_references(referent=top_note.original, original=True)]
l1 = [x.id for x in guest_client.get_references(referent=top_note.original, original=False)]
print("l0 and l1 equal:", l0 == l1)

Requesting references with original=True and False for top_note.original (HkxB3XguKm)
l0 and l1 equal: True
[Note(id = 'SYWVoY_zkq',original = None,number = 619,cdate = 1644493211604,mdate = 1644493211604,tcdate = 1644493211604,tmdate = 1644493211604,ddate = None,content = {'data': '[CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [MNIST](https://paperswithcode.com/dataset/mnist)'},forum = 'HkxB3XguKm',referent = 'HkxB3XguKm',invitation = 'PapersWithCode.com/-/ICLR_2019_Code',replyto = None,readers = ['everyone'],nonreaders = [],signatures = ['PapersWithCode.com'],writers = [],details = {})]


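## Dates for references to Blind Submission
There is just one reference to the blind submission. It has the same creation date (`cdate` and `tcdate`) as the top note, but an earlier modification date: the reference was last modified in December 2018, whereas the top note's `tmdate` is in February 2022, which reflects a 2022 OpenReview PapersWithCode update.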

In [7]:
reference_to_top_note, = list(guest_client.get_references(referent=top_note.id))
print('cdates of top note and reference to top note equal?', top_note.cdate == reference_to_top_note.cdate)
print('tcdates of top note and reference to top note equal', top_note.tcdate == reference_to_top_note.tcdate)
print('tmdates of top note and reference to top note equal', top_note.tmdate == reference_to_top_note.tmdate)
print()
print('tmdate of top note: ', clean_date(top_note.tmdate) )
print('tmdate of reference to top note: ', clean_date(reference_to_top_note.tmdate) )

cdates of top note and reference to top note equal? True
tcdates of top note and reference to top note equal True
tmdates of top note and reference to top note equal False

tmdate of top note:  02/10/2022, 06:40
tmdate of reference to top note:  12/20/2018, 20:23


## Dates for references to original note
All the references to the original note have the same `cdate`, which is in turn the same as the `cdate` of the blind submission. Each reference has a different `tcdate` and `tmdate`.

**Question**: Aren't notes supposed to all have the same tcdate, but different tmdates?

In [8]:
references_to_top_note = list(guest_client.get_references(referent=top_note.id, original=True))
cdates = [ref.cdate for ref in references_to_top_note]
tcdates = [ref.tcdate for ref in references_to_top_note]
tmdates = [ref.tmdate for ref in references_to_top_note]

print("Number of unique cdates among references", len(set(cdates)))
print("Number of unique tcdates among references", len(set(tcdates)))
print("Number of unique tmdates among references", len(set(tmdates)))

print("cdate of references: ", clean_date(cdates[0]))
print("cdate of top note: ", clean_date(top_note.cdate))

Number of unique cdates among references 1
Number of unique tcdates among references 32
Number of unique tmdates among references 32
cdate of references:  09/27/2018, 18:35
cdate of top note:  09/27/2018, 18:35
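
Because every reference has a distinct `tcdate`, sorting on that field is one way to lay the references out as a timeline. A minimal sketch, assuming that ordering by `tcdate` is a reasonable proxy for revision order (the notebook does not confirm this); `revision_timeline` and the stand-in objects with placeholder IDs and timestamps are hypothetical.

```python
from datetime import datetime, timezone
from types import SimpleNamespace

def revision_timeline(references):
    """Return (id, UTC date string) pairs ordered by tcdate (milliseconds since epoch)."""
    ordered = sorted(references, key=lambda ref: ref.tcdate)
    return [
        (ref.id, datetime.fromtimestamp(ref.tcdate / 1000, tz=timezone.utc).strftime("%m/%d/%Y, %H:%M"))
        for ref in ordered
    ]

# Placeholder references; real ones would come from get_references(..., original=True).
demo_refs = [
    SimpleNamespace(id="rev_late", tcdate=86_400_000),
    SimpleNamespace(id="rev_early", tcdate=0),
]
demo_timeline = revision_timeline(demo_refs)
print(demo_timeline)
```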


In [23]:


# Check whether the original flag changes the references returned for any reply note.
all_retrievals_same_flag = True
for note in guest_client.get_notes(forum=top_note.id):
  if note.id == note.forum:
    continue  # Skip the top note; it was compared above
  l_true = get_reference_ids(note.id, True)
  l_false = get_reference_ids(note.id, False)
  if l_true != l_false:
    print("Different results retrieving with original=True and original=False.")
    all_retrievals_same_flag = False
    break

if all_retrievals_same_flag:
  print("original=True and original=False all returned the same results.")

original=True and original=False all returned the same results.


In [30]:
COMMENT_ID = "rkey8cX9hm"
temp, = guest_client.get_notes(id=COMMENT_ID)
#print(temp)
l0 = [x.id for x in guest_client.get_references(referent=COMMENT_ID, original=True)]
l1 = [x.id for x in guest_client.get_references(referent=COMMENT_ID, original=False)]
print(l0==l1, len(l0))
#print(temp)
for reference in  guest_client.get_references(referent=COMMENT_ID):
  print(reference.id, reference.replyto, reference.original, reference.referent, reference.forum)
  #print(reference)

True 3
B100rKh0X Skh4jRcKQ None rkey8cX9hm Skh4jRcKQ
S14grK2A7 Skh4jRcKQ None rkey8cX9hm Skh4jRcKQ
r1JU9Xq27 Skh4jRcKQ None rkey8cX9hm Skh4jRcKQ


In [50]:
dicts = []
for referent in [top_note.id, top_note.original]:
  for original_flag in [True, False]:
    for reference in guest_client.get_references(referent=referent, original=original_flag):
      dicts.append({
        "referent": referent,
        "original_flag": str(original_flag),
        "id": reference.id,
        "original": reference.original,
        "referent_value": reference.referent
      })
      
pd.DataFrame.from_dict(dicts)

|  | referent | original_flag | id | original | referent_value |
|---|---|---|---|---|---|
| 0 | Skh4jRcKQ | True | SYWVoY_zkq | HkxB3XguKm | Skh4jRcKQ |
| 1 | Skh4jRcKQ | True | BJd8HgFvS | HkxB3XguKm | Skh4jRcKQ |
| 2 | Skh4jRcKQ | True | SJgOppV24 | HkxB3XguKm | Skh4jRcKQ |
| 3 | Skh4jRcKQ | True | SyQCYkbhN | HkxB3XguKm | Skh4jRcKQ |
| 4 | Skh4jRcKQ | True | SkKbD1-nE | HkxB3XguKm | Skh4jRcKQ |
| 5 | Skh4jRcKQ | True | HJC0wMdoV | HkxB3XguKm | Skh4jRcKQ |
| 6 | Skh4jRcKQ | True | ByHQ568wE | HkxB3XguKm | Skh4jRcKQ |
| 7 | Skh4jRcKQ | True | Hk80Ya8DN | HkxB3XguKm | Skh4jRcKQ |
| 8 | Skh4jRcKQ | True | BkLEP_mUV | HkxB3XguKm | Skh4jRcKQ |
| 9 | Skh4jRcKQ | True | SyeajMtE4 | HkxB3XguKm | Skh4jRcKQ |


## Some questions about the above table

* How can we distinguish between, say, `B1BnmldK7` (31) and `BJoEsC5tm` (32)? Both have the same values for `referent` and `original`, but one is retrieved only when the `original` flag is `False` and the other only when it is `True`.
* I would expect calling `get_references` with `top_note.id` and `original=True` to be equivalent to calling it with `top_note.original` and `original=False`; that is, setting the `original` flag should cause references to the original to be retrieved instead. However, the latter call returns only what is shown in row 34.
  * What does it mean to be a reference to the original note?
  * Why are rows 33 and 34 equivalent? Is it because `SYWVoY_zkq` does not have an original value? I cannot check this because I do not have access to the note `SYWVoY_zkq`, see below.
  * What distinguishes, say, `BJd8HgFvS` (2) from `SYWVoY_zkq` (1)?
    * `SYWVoY_zkq` is also retrieved in two other cases (33, 34), but I can't access `BJd8HgFvS`.
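
One way to approach these questions is to invert the table: for each reference ID, collect every (`referent`, `original_flag`) combination under which it was retrieved. A minimal sketch, where `index_by_reference_id` is a hypothetical helper that expects rows shaped like the entries of `dicts` above. The demo rows are limited to combinations reported earlier in this notebook.

```python
from collections import defaultdict

def index_by_reference_id(rows):
    """Map each reference id to the set of (referent, original_flag) queries that returned it."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["id"]].add((row["referent"], row["original_flag"]))
    return dict(seen)

# A few rows taken from the outputs shown in this notebook.
demo_rows = [
    {"referent": "Skh4jRcKQ", "original_flag": "True", "id": "SYWVoY_zkq"},
    {"referent": "Skh4jRcKQ", "original_flag": "True", "id": "BJd8HgFvS"},
    {"referent": "HkxB3XguKm", "original_flag": "True", "id": "SYWVoY_zkq"},
    {"referent": "HkxB3XguKm", "original_flag": "False", "id": "SYWVoY_zkq"},
]
demo_index = index_by_reference_id(demo_rows)
for ref_id, queries in demo_index.items():
    print(ref_id, sorted(queries))
```

In the notebook this could be called as `index_by_reference_id(dicts)`; references that appear under exactly one combination are then easy to spot.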