# TUTORIAL

In this notebook, we will show how to access the documents in the different VerbCL collections:
* VerbCL Opinions
* VerbCL CitationGraph
* VerbCL Highlights

We access ElasticSearch through the `elasticsearch-dsl` module; refer to its [Documentation](https://elasticsearch-dsl.readthedocs.io/en/latest/) for more information. All the ElasticSearch queries are available through this module.

The documents are described in the file `utils/elastic.py` as the classes `OpinionDocument`, `OpinionCitationGraph` and `OpinionSentence`.
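For orientation, the fields this notebook reads from each document type can be summarised as plain dataclasses. This is a hypothetical stand-in, not the definitions in `utils/elastic.py`: the `...Sketch` class names and all field types are assumptions, and the real classes likely define more fields than the ones used here.

```python
from dataclasses import dataclass


@dataclass
class OpinionDocumentSketch:
    """Fields read from an opinion in this notebook (types are assumed)."""
    opinion_id: int
    raw_text: str


@dataclass
class OpinionCitationGraphSketch:
    """Fields used to query the citation graph (types are assumed)."""
    cited_opinion_id: int
    score: float


@dataclass
class OpinionSentenceSketch:
    """Fields used to query and display highlights (types are assumed)."""
    opinion_id: int
    sentence_id: int
    raw_text: str
    highlight: bool
    count_citations: int
```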

In [None]:
import sys
sys.path.append('../')

In [None]:
import pandas as pd

from utils import elastic_init
from utils import OpinionCitationGraph, OpinionSentence, OpinionDocument

from elasticsearch_dsl import connections

In [None]:
pd.set_option("display.max_colwidth", None)

# Connect to ElasticSearch

Adjust the environment variables in the file `elastic-local.env`.

In [None]:
alias = elastic_init("elastic-local.env")
print(alias)
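To check what the environment file contains before connecting, it can be read with a few lines of standard-library Python. This is a minimal sketch that assumes the file uses plain `KEY=VALUE` lines; `read_env_file` is a hypothetical helper and does not reproduce what `elastic_init` does internally.

```python
from pathlib import Path


def read_env_file(path):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments.

    Assumption: the file uses the simple KEY=VALUE format.
    """
    settings = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that are not KEY=VALUE
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings
```

Calling `read_env_file` on the environment file returns a dictionary of the settings, which is a quick way to confirm that the values match your local setup.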

In [None]:
OPINION_ID = 1239944

# VerbCL Opinions

We will retrieve the opinion 1239944 used as an example in the paper.

In [None]:
s = OpinionDocument.search(using=alias).query("match", opinion_id=OPINION_ID)

In [None]:
assert s.count() == 1

In [None]:
retrieved = list(s.scan())

In [None]:
opinion = retrieved[0]

In [None]:
print(opinion.raw_text)

# VerbCL CitationGraph

We will retrieve all the opinions that cite opinion 1239944.

In [None]:
s = OpinionCitationGraph.search(using=alias).query("match", cited_opinion_id=OPINION_ID).filter("range", score={"gt": -1})
scan = s.scan()
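The chained `query` and `filter` calls above should correspond roughly to a `bool` query with one `must` clause and one `filter` clause. The sketch below builds that request body by hand for illustration; `citation_query_body` is a hypothetical helper, and calling `s.to_dict()` on the actual search object shows the exact body that is sent.

```python
import json


def citation_query_body(cited_opinion_id, min_score=-1):
    """Hand-built body: match on cited_opinion_id, range filter on score.

    Assumption: this mirrors what the DSL generates for the search above.
    """
    return {
        "query": {
            "bool": {
                # The match clause selects documents citing the given opinion.
                "must": [{"match": {"cited_opinion_id": cited_opinion_id}}],
                # The range clause keeps documents whose `score` field is > min_score.
                "filter": [{"range": {"score": {"gt": min_score}}}],
            }
        }
    }


print(json.dumps(citation_query_body(1239944), indent=2))
```

Note that `score` here is a field of the citation graph documents, not the ElasticSearch relevance score. Clauses placed in `filter` only include or exclude documents and do not contribute to relevance ranking.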

In [None]:
citings = pd.DataFrame([r.to_dict() for r in scan])

In [None]:
citings

# VerbCL Highlights

We will retrieve all the highlights from opinion 1239944.

In [None]:
s = OpinionSentence.search(using=alias).query("match", opinion_id=OPINION_ID).query("match", highlight=True)
scan = s.scan()

In [None]:
highlights = pd.DataFrame([r.to_dict() for r in scan])

In [None]:
highlights[['sentence_id', 'count_citations', 'raw_text']]