<a href="https://colab.research.google.com/github/leonardo3108/IA368dd/blob/main/exercicios/Aula_8/Aula_8_SPLADE.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
nome = 'Leonardo Augusto da Silva Pacheco'
print(f'Meu nome é {nome}')

Meu nome é Leonardo Augusto da Silva Pacheco


# Enunciado

Implementar a fase de indexação e buscas de um modelo sparso
- Usar este modelo SPLADE já treinado naver/splade_v2_distil (do distilbert) ou splade-cocondenser-selfdistil (do BERT-base 110M params). Mais informações sobre os modelos estão neste artigo: https://arxiv.org/pdf/2205.04733.pdf
- Não é necessário treinar o modelo
- Avaliar nDCG@10 no TREC-COVID e comparar resultados com o BM25 e buscador denso da semana passada

A dificuldade do exercício está em implementar a função de busca e ranqueamento usada pelo SPLADE. A implementação deve ser codificada e usar implementação do SPLADE apenas para comparação. A implementação do índice invertido é apenas um "dicionário python".

- Fazer a comparação dos seus resultados com a busca "original" do SPLADE.
- Medir latência (s/query)


# Setup

## Integração com Google Drive

In [50]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Hiperparâmetros

In [3]:
max_length = 256
batch_size = 64
model_name = 'naver/splade_v2_distil'

## Instalação de libs

In [4]:
!pip install transformers 
!pip install datasets
!pip install sentence-transformers
!pip install pyserini
!pip install faiss-gpu

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.28.1-py3-none-any.whl (7.0 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m7.0/7.0 MB[0m [31m57.0 MB/s[0m eta [36m0:00:00[0m
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading huggingface_hub-0.14.1-py3-none-any.whl (224 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m224.5/224.5 kB[0m [31m27.4 MB/s[0m eta [36m0:00:00[0m
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m7.8/7.8 MB[0m [31m101.1 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.14.1 tokenizers-0.13.3 transformers-4.28.1
Looking in indexes: https://pypi.org/simple, https://u

## Importação de libs

In [5]:
import json
import numpy as np
import pickle
import torch
from torch.nn.functional import relu
from torch.utils import data
from torch.utils.data import DataLoader
from tqdm.auto import tqdm
from transformers import AutoModelForMaskedLM, AutoTokenizer, BatchEncoding

## Sementes

In [6]:
np.random.seed(42)

## Utilização de GPUs

In [7]:
if torch.cuda.is_available(): 
   dev = "cuda:0"
else: 
   dev = "cpu"
device = torch.device(dev)
print('Using {}'.format(device))

Using cuda:0


In [8]:
if dev != 'cpu':
    !nvidia-smi

Wed Apr 26 20:37:59 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   57C    P8    11W /  70W |      3MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

# Preparação dos dados

## Obtenção - TREC-COVID

In [9]:
!wget -nc 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/trec-covid.zip'

--2023-04-26 20:37:59--  https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/trec-covid.zip
Resolving public.ukp.informatik.tu-darmstadt.de (public.ukp.informatik.tu-darmstadt.de)... 130.83.167.186
Connecting to public.ukp.informatik.tu-darmstadt.de (public.ukp.informatik.tu-darmstadt.de)|130.83.167.186|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 73876720 (70M) [application/zip]
Saving to: ‘trec-covid.zip’


2023-04-26 20:38:03 (19.8 MB/s) - ‘trec-covid.zip’ saved [73876720/73876720]



In [10]:
!unzip -o trec-covid.zip

Archive:  trec-covid.zip
   creating: trec-covid/
   creating: trec-covid/qrels/
  inflating: trec-covid/qrels/test.tsv  
  inflating: trec-covid/corpus.jsonl  
  inflating: trec-covid/queries.jsonl  


## Tratamento - qrels de test

In [11]:
with open('trec-covid/qrels/test.tsv', 'r') as fin:
  data = fin.read().splitlines(True)

print(data[:5])
print(data[4].split())

['query-id\tcorpus-id\tscore\n', '1\t005b2j4b\t2\n', '1\t00fmeepz\t1\n', '1\tg7dhmyyo\t2\n', '1\t0194oljo\t1\n']
['1', '0194oljo', '1']


In [12]:
with open('test_adjusted.tsv', 'w') as fout:
    for line in data[1:]:
        query_id, corpus_id, score = line.split()
        fout.write(f'{query_id}\t0\t{corpus_id}\t{score}\n')

## Tratamento - corpus

In [13]:
corpus = []
with open('trec-covid/corpus.jsonl') as fin:
    for i, line in enumerate(fin):
        doc = json.loads(line)
        corpus.append((doc['_id'], f"{doc['title']} {doc['text']}"))

for text in corpus[:10]:
    print(text)

('ug7v899j', 'Clinical features of culture-proven Mycoplasma pneumoniae infections at King Abdulaziz University Hospital, Jeddah, Saudi Arabia OBJECTIVE: This retrospective chart review describes the epidemiology and clinical features of 40 patients with culture-proven Mycoplasma pneumoniae infections at King Abdulaziz University Hospital, Jeddah, Saudi Arabia. METHODS: Patients with positive M. pneumoniae cultures from respiratory specimens from January 1997 through December 1998 were identified through the Microbiology records. Charts of patients were reviewed. RESULTS: 40 patients were identified, 33 (82.5%) of whom required admission. Most infections (92.5%) were community-acquired. The infection affected all age groups but was most common in infants (32.5%) and pre-school children (22.5%). It occurred year-round but was most common in the fall (35%) and spring (30%). More than three-quarters of patients (77.5%) had comorbidities. Twenty-four isolates (60%) were associated with pne

## Tratamento - queries

In [14]:
queries = []
with open('trec-covid/queries.jsonl') as fin:
    for line in fin:
      query = json.loads(line)
      queries.append({'id': query['_id'], 'text': query['text']})

for query in queries[:10]:
    print(query)      

{'id': '1', 'text': 'what is the origin of COVID-19'}
{'id': '2', 'text': 'how does the coronavirus respond to changes in the weather'}
{'id': '3', 'text': 'will SARS-CoV2 infected people develop immunity? Is cross protection possible?'}
{'id': '4', 'text': 'what causes death from Covid-19?'}
{'id': '5', 'text': 'what drugs have been active against SARS-CoV or SARS-CoV-2 in animal studies?'}
{'id': '6', 'text': 'what types of rapid testing for Covid-19 have been developed?'}
{'id': '7', 'text': 'are there serological tests that detect antibodies to coronavirus?'}
{'id': '8', 'text': 'how has lack of testing availability led to underreporting of true incidence of Covid-19?'}
{'id': '9', 'text': 'how has COVID-19 affected Canada'}
{'id': '10', 'text': 'has social distancing had an impact on slowing the spread of COVID-19?'}


## Carregamento do Tokenizer

In [15]:
tokenizer = AutoTokenizer.from_pretrained(model_name)

Downloading (…)okenizer_config.json:   0%|          | 0.00/258 [00:00<?, ?B/s]

Downloading (…)lve/main/config.json:   0%|          | 0.00/523 [00:00<?, ?B/s]

Downloading (…)solve/main/vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

Downloading (…)cial_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

## Classe de dataset

In [16]:
class DatasetCovid(torch.utils.data.Dataset):
    def __init__(self, tokenizer, texts, max_seq_length = max_length):
        self.max_seq_length = max_seq_length
        self.tokenizer = tokenizer
        if type(texts[0]) is tuple:
            self.texts = [text[1] for text in texts]
        else:
            self.texts = texts
    def __len__(self):
        return len(self.texts)
    def __getitem__(self, idx):
        return self.tokenizer(self.texts[idx], padding=True, return_special_tokens_mask=True, add_special_tokens=True, truncation=True, max_length=self.max_seq_length)

## Função collate - padding

In [17]:
def collate_fn(batch):
    return BatchEncoding(tokenizer.pad(batch, return_tensors='pt'))

# Splade

## Carregamento do modelo

In [18]:
model = AutoModelForMaskedLM.from_pretrained(model_name).to(device)

Downloading pytorch_model.bin:   0%|          | 0.00/268M [00:00<?, ?B/s]

## Funções de expansão de termos para documentos

In [35]:
def splade_text(model, tokenizer, text, device):
    inputs = tokenizer(text, add_special_tokens=True, return_special_tokens_mask=True, return_tensors='pt', truncation=True, max_length=256)
    with torch.no_grad():
        outputs = model(input_ids = inputs['input_ids'].to(device), attention_mask = inputs['attention_mask'].to(device))
        logits = outputs.logits.squeeze()
        wj, _ = torch.max(torch.log(1 + relu(logits)), dim = 0)
        wj = wj.cpu().to_sparse()
        ids = wj.indices()
    return ids, wj.values(), tokenizer.convert_ids_to_tokens(ids.squeeze())

In [36]:
def splade_batch(model, tokenizer, batch, device):
    with torch.no_grad():
        outputs = model(input_ids = batch['input_ids'].to(device), attention_mask = batch['attention_mask'].to(device))
        logits = outputs.logits
        wj, _ = torch.max(torch.log(1 + relu(logits)), dim = 1)
    return wj.cpu().to_sparse()

In [37]:
def splade_dataloader(model, tokenizer, dataloader, device):
    wj = None
    for id_batch, batch in enumerate(tqdm(dataloader)):
        wj_batch = splade_batch(model, tokenizer, batch, device)
        if wj is None:
            wj = wj_batch
        else:
            wj = torch.cat((wj, wj_batch), dim = 0)
    return wj

In [38]:
def show_splade(texts, wj, tokenizer, quantity = 10):
    indices = wj.coalesce().indices()
    values = wj.coalesce().values()

    for sentence, text in enumerate(texts):
        print('Text', sentence, '-', text)
        sentence_mask = indices[0] == sentence
        ids = indices[1][sentence_mask].numpy()
        tokens = tokenizer.convert_ids_to_tokens(ids)
        zip_list = sorted(list(zip(values, tokens)), reverse = True)
        for value, token in zip_list[:quantity]:
            print('\t' + token, '-', value.item())  

# Execução de testes

## Dataset

In [39]:
texts_test = [
    'I love taking long walks on the beach at sunset.',
    'She was so nervous about the job interview that she could hardly sit still.',
    'The smell of fresh-baked cookies always makes me feel happy.',
    'Despite the rain, the outdoor concert was still a huge success.',
    "He couldn't believe how quickly time had passed since he graduated from college."
]
# ['where eat pizza', 'what about the weather today', 'how to achieve wisdom', 'what is the capital of Australia', 'when europeans founded America']

dataset_test = DatasetCovid(tokenizer, texts_test, max_seq_length = 20)

for i in range(len(dataset_test)):
    print(i, '-', tokenizer.decode(dataset_test[i]['input_ids']))
    print('\t', dataset_test[i])

0 - [CLS] i love taking long walks on the beach at sunset. [SEP]
	 {'input_ids': [101, 1045, 2293, 2635, 2146, 7365, 2006, 1996, 3509, 2012, 10434, 1012, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'special_tokens_mask': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]}
1 - [CLS] she was so nervous about the job interview that she could hardly sit still. [SEP]
	 {'input_ids': [101, 2016, 2001, 2061, 6091, 2055, 1996, 3105, 4357, 2008, 2016, 2071, 6684, 4133, 2145, 1012, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'special_tokens_mask': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]}
2 - [CLS] the smell of fresh - baked cookies always makes me feel happy. [SEP]
	 {'input_ids': [101, 1996, 5437, 1997, 4840, 1011, 17776, 16324, 2467, 3084, 2033, 2514, 3407, 1012, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'special_tokens_mask': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]}
3 - [CLS] despite the rain, the outdoo

## Dataloader

In [40]:
dataloader_test = DataLoader(dataset_test, batch_size=2, shuffle=False, collate_fn=collate_fn)

for batch_id, batch in enumerate(dataloader_test):
    print('Batch', batch_id, '- size:', len(batch['input_ids']))
    for i in range(len(batch['input_ids'])):  
        print('\tText', i, '-', tokenizer.decode(batch['input_ids'][i]))
        for key in batch.keys():
            print('\t\t' + key + ':', batch[key][i])

Batch 0 - size: 2
	Text 0 - [CLS] i love taking long walks on the beach at sunset. [SEP] [PAD] [PAD] [PAD] [PAD]
		input_ids: tensor([  101,  1045,  2293,  2635,  2146,  7365,  2006,  1996,  3509,  2012,
        10434,  1012,   102,     0,     0,     0,     0])
		attention_mask: tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0])
		special_tokens_mask: tensor([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
	Text 1 - [CLS] she was so nervous about the job interview that she could hardly sit still. [SEP]
		input_ids: tensor([ 101, 2016, 2001, 2061, 6091, 2055, 1996, 3105, 4357, 2008, 2016, 2071,
        6684, 4133, 2145, 1012,  102])
		attention_mask: tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
		special_tokens_mask: tensor([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1])
Batch 1 - size: 2
	Text 0 - [CLS] the smell of fresh - baked cookies always makes me feel happy. [SEP]
		input_ids: tensor([  101,  1996,  5437,  1997,  4840,  1011, 17776, 16324,  2467,

## Splade - sem dataloader

In [42]:
for text in texts_test:
    print(text)
    _, values, tokens = splade_text(model, tokenizer, text, device)
    for value, token in sorted(list(zip(values, tokens))[:10], reverse = True):
        print('\t' + token, '-', value.item())

I love taking long walks on the beach at sunset.
	you - 0.8841824531555176
	on - 0.7934272289276123
	at - 0.7721477746963501
	i - 0.5740147829055786
	to - 0.4312276840209961
	her - 0.29871657490730286
	. - 0.2655070424079895
	my - 0.21907301247119904
	it - 0.16131193935871124
	him - 0.07204177230596542
She was so nervous about the job interview that she could hardly sit still.
	her - 2.030452013015747
	she - 1.9141789674758911
	was - 1.195767879486084
	him - 1.0662636756896973
	. - 0.5281599164009094
	had - 0.33894625306129456
	not - 0.27704575657844543
	his - 0.27175912261009216
	would - 0.26201331615448
	it - 0.20605656504631042
The smell of fresh-baked cookies always makes me feel happy.
	me - 0.7531765103340149
	new - 0.413139671087265
	of - 0.37091371417045593
	. - 0.36289894580841064
	you - 0.30269885063171387
	like - 0.2823449373245239
	him - 0.202437624335289
	my - 0.16390003263950348
	her - 0.023230422288179398
	in - 0.022699983790516853
Despite the rain, the outdoor concert w

## Splade - com Dataloader

In [43]:
wj = splade_dataloader(model, tokenizer, dataloader_test, device)
show_splade(texts_test, wj, tokenizer, 10)

  0%|          | 0/3 [00:00<?, ?it/s]

Text 0 - I love taking long walks on the beach at sunset.
	sunset - 2.231519937515259
	beach - 2.2043275833129883
	long - 1.8040367364883423
	walk - 1.6086158752441406
	sunrise - 1.588561773300171
	love - 1.4954960346221924
	walking - 1.3851253986358643
	walks - 1.2239038944244385
	beaches - 1.1227279901504517
	take - 0.9775022268295288
Text 1 - She was so nervous about the job interview that she could hardly sit still.
	nerves - 2.231519937515259
	interview - 2.2043275833129883
	too - 1.8040367364883423
	sitting - 1.6086158752441406
	interviewing - 1.588561773300171
	police - 1.4954960346221924
	barely - 1.3851253986358643
	monica - 1.2239038944244385
	anxious - 1.1227279901504517
	couldn - 0.9775022268295288
Text 2 - The smell of fresh-baked cookies always makes me feel happy.
	scent - 2.231519937515259
	eat - 2.2043275833129883
	oven - 2.030451774597168
	baked - 1.9141793251037598
	song - 1.8040367364883423
	store - 1.6086158752441406
	mood - 1.588561773300171
	feel - 1.495496034622

# Expansão - Corpus TREC-COVID

## Dataset e Dataloader

In [45]:
dataset_covid = DatasetCovid(tokenizer, corpus, max_seq_length = max_length)
dataloader_covid = DataLoader(dataset_covid, batch_size=5, shuffle=False, collate_fn=collate_fn)

## Splade

In [46]:
wj = splade_dataloader(model, tokenizer, dataloader_covid, device)

  0%|          | 0/34267 [00:00<?, ?it/s]

In [47]:
wj._nnz(), wj.numel()

(53199286, 5229395304)

## Salvamento da matriz esparsa

In [49]:
import pickle

with open('wj.pkl','wb') as f:
      pickle.dump(wj, f, pickle.HIGHEST_PROTOCOL)

# Busca - queries TREC-COVID

## Carga da matriz esparsa

In [None]:
with open('wj.pkl','rb') as f:
    wj = pickle.loads(f)

## Teste com 5 queries e 10 melhores resultados

In [74]:
for query in queries[:5]:
    print('Query', query['id'], '-', query['text'])   
    input_ids = tokenizer(query['text'], padding=True, return_special_tokens_mask=True, add_special_tokens=True, truncation=True, max_length=max_length)['input_ids']
    print('\tinput_ids:', input_ids)
    q = torch.zeros(tokenizer.vocab_size)
    q[input_ids] = 1.
    scores = torch.matmul(wj, q.unsqueeze(dim = 1)).squeeze()
    #print(wj.size(), q.unsqueeze(dim = 1).size(), scores.size())
    sorted_scores, indices_scores = torch.sort(scores, descending=True)
    sorted_scores = sorted_scores[:10]
    indices_scores = indices_scores[:10]
    for i in range(10):
        print('\t', i, '- score:', sorted_scores[i].item(), '- doc:', corpus[indices_scores[i]])

Query 1 - what is the origin of COVID-19
	input_ids: [101, 2054, 2003, 1996, 4761, 1997, 2522, 17258, 1011, 2539, 102]
	 0 - score: 11.630707740783691 - doc: ('rzpbpxw2', 'What is COVID-19? ')
	 1 - score: 10.543111801147461 - doc: ('gdfxiosc', 'What is COVID‐19? ')
	 2 - score: 10.498983383178711 - doc: ('cgvj10r2', 'Cerebrovascular Disease in COVID-19 Coronavirus disease 19 (COVID-19) is a pandemic originating in Wuhan, China, in December 2019. Early reports suggest that there are neurologic manifestations of COVID-19, including acute cerebrovascular disease. We report a case of COVID-19 with acute ischemic stroke. To our knowledge, this is the first reported case of COVID-19-related cerebral infarcts that includes brain imaging at multiple time points and CT angiography. There is a growing body of published evidence that complications of COVID-19 are not limited to the pulmonary system. Neuroradiologists should be aware of a wide range of neurologic manifestations, including cerebro

## Geração do resultado

In [79]:
%%time
with open('run-trec-covid-splade.txt', 'w') as runfile:
    for query_id, query in enumerate(queries):
        input_ids = tokenizer(query['text'], padding=True, return_special_tokens_mask=True, add_special_tokens=True, truncation=True, max_length=max_length)['input_ids']
        q = torch.zeros(tokenizer.vocab_size)
        q[input_ids] = 1.
        scores = torch.matmul(wj, q.unsqueeze(dim = 1)).squeeze()
        sorted_scores, indices_scores = torch.sort(scores, descending=True)
        sorted_scores = sorted_scores[:1000]
        indices_scores = indices_scores[:1000]
        ids_docs = [corpus[i][0] for i in indices_scores]
        for i, (id_doc, score) in enumerate(zip(ids_docs, sorted_scores)):
            runfile.write(f'{query_id+1} Q0 {id_doc} {i+1} {float(score):.6f} Splade\n')        

CPU times: user 1min, sys: 104 ms, total: 1min
Wall time: 1min 1s


In [80]:
!head run-trec-covid-splade.txt

1 Q0 rzpbpxw2 1 11.630708 Splade
1 Q0 gdfxiosc 2 10.543112 Splade
1 Q0 cgvj10r2 3 10.498983 Splade
1 Q0 1mjaycee 4 10.445834 Splade
1 Q0 0wm6u10a 5 10.336140 Splade
1 Q0 pu9l36j9 6 10.301398 Splade
1 Q0 n13hg2yd 7 10.296379 Splade
1 Q0 hh7zzzbk 8 10.256858 Splade
1 Q0 vx7ebtbp 9 10.116970 Splade
1 Q0 sh7lrdou 10 10.111296 Splade


# Avaliação

 ## Cálculo do nDCG@10

In [81]:
!python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 test_adjusted.tsv run-trec-covid-splade.txt

Downloading https://search.maven.org/remotecontent?filepath=uk/ac/gla/dcs/terrierteam/jtreceval/0.0.5/jtreceval-0.0.5-jar-with-dependencies.jar to /root/.cache/pyserini/eval/jtreceval-0.0.5-jar-with-dependencies.jar...
/root/.cache/pyserini/eval/jtreceval-0.0.5-jar-with-dependencies.jar already exists!
Skipping download.
Running command: ['java', '-jar', '/root/.cache/pyserini/eval/jtreceval-0.0.5-jar-with-dependencies.jar', '-c', '-m', 'ndcg_cut.10', 'test_adjusted.tsv', 'run-trec-covid-splade.txt']
Results:
ndcg_cut_10           	all	0.6020


In [82]:
!head test_adjusted.tsv

1	0	005b2j4b	2
1	0	00fmeepz	1
1	0	g7dhmyyo	2
1	0	0194oljo	1
1	0	021q9884	1
1	0	02f0opkr	1
1	0	047xpt2c	0
1	0	04ftw7k9	0
1	0	pl9ht0d0	0
1	0	05vx82oo	0
