```
title: "Grewpy • request"
date: 2024-04-22
```

[`grewpy` Tutorial](../tutorial)

# Grewpy tutorial: Run requests on a corpus

Download the notebook [here](../request.ipynb).

In [None]:
import grewpy
from grewpy import Corpus, Request

grewpy.set_config("sud")  # other accepted values: "ud" or "basic"

## Import data
The `Corpus` constructor takes a `conllu` file or a directory containing `conllu` files.
A `Corpus` lets you run queries and count occurrences.

In [None]:
treebank_path = "SUD_English-PUD"
corpus = Corpus(treebank_path)
print(type(corpus))

In [None]:
n_sentences = len(corpus)
sent_ids = corpus.get_sent_ids()

print(f"{n_sentences = }")
print(f"{sent_ids[0] = }")
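To make the two values above concrete without loading a treebank, here is a minimal pure-Python sketch that reads the `# sent_id = ...` comment lines of a CoNLL-U text. The toy text and the helper `read_sent_ids` are hypothetical illustrations, not part of `grewpy`; the sketch only assumes that each sentence carries a `sent_id` comment, as is usual in CoNLL-U files.

```python
# Toy CoNLL-U text with two sentences (hand-written, not taken from PUD).
CONLLU_TEXT = """\
# sent_id = toy-1
# text = She sleeps.
1\tShe\tshe\tPRON\t_\t_\t2\tsubj\t_\t_
2\tsleeps\tsleep\tVERB\t_\t_\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

# sent_id = toy-2
# text = Dogs bark.
1\tDogs\tdog\tNOUN\t_\t_\t2\tsubj\t_\t_
2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def read_sent_ids(conllu_text):
    """Return the sent_id values in document order (hypothetical helper)."""
    prefix = "# sent_id = "
    return [
        line[len(prefix):].strip()
        for line in conllu_text.splitlines()
        if line.startswith(prefix)
    ]


toy_sent_ids = read_sent_ids(CONLLU_TEXT)
print(f"{len(toy_sent_ids) = }")
print(f"{toy_sent_ids[0] = }")
```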

## Explore data
See the [Grew-match tutorial](https://universal.grew.fr/?corpus=UD_English-ParTUT@2.14) to practice writing Grew requests.

### Count the number of subjects in the corpus

In [None]:
req1 = Request("pattern { X-[subj]->Y }")
corpus.count(req1)

It is possible to extend an already existing request with the methods `pattern`, `without` and `with_` (because `with` is a Python keyword).
Hence, the request `req1bis` below is equivalent to `req1`.

In [None]:
req1bis = Request().pattern("X-[subj]->Y")
corpus.count(req1bis)

### Count the number of subjects such that the subject is not a pronoun

In [None]:
req2 = Request().pattern("X-[subj]->Y").without("Y[upos=PRON]")
corpus.count(req2)

### Count the number of subjects with at least one dependent
Note the use of `with_` (because `with` is a Python keyword).

In [None]:
req3 = Request().pattern("X-[subj]->Y").with_("Y->Z")
corpus.count(req3)

### `with` and `without` items can be stacked

In [None]:
req4 = Request().pattern("X-[subj]->Y").with_("Y->Z").without("Y[upos=PRON]").without("X[upos=VERB]")
corpus.count(req4)

### Building a request with the raw Grew syntax
It is possible to build a request directly from the concrete syntax used in Grew-match or in Grew rules.
The request `req4` can be written as:

In [None]:
req4bis = Request("""
pattern { X-[subj]->Y }
with { Y->Z }
without { Y[upos=PRON] }
without { X[upos=VERB] }
""")
corpus.count(req4bis)
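The correspondence between the chained methods and the concrete syntax can be sketched in plain Python. The helper `build_request_text` below is hypothetical (it is not part of `grewpy`): it simply wraps each clause body in the matching `pattern { ... }`, `with { ... }` or `without { ... }` item and joins them, one item per line.

```python
def build_request_text(pattern, with_items=(), without_items=()):
    """Assemble Grew concrete syntax from clause bodies (hypothetical helper)."""
    lines = [f"pattern {{ {pattern} }}"]
    lines += [f"with {{ {item} }}" for item in with_items]
    lines += [f"without {{ {item} }}" for item in without_items]
    return "\n".join(lines)


# Same clauses as req4 / req4bis above.
request_text = build_request_text(
    "X-[subj]->Y",
    with_items=["Y->Z"],
    without_items=["Y[upos=PRON]", "X[upos=VERB]"],
)
print(request_text)
```

The resulting string has the same four items as the text given to `Request` for `req4bis`, so it could be passed to `Request` in the same way.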

### More complex queries are possible, with clustering of the results
See [Clustering](../../doc/clustering) for more documentation.
Below, we cluster the occurrences of the subject relation according to the POS of the governor.

In [None]:
req5 = Request("pattern {X-[subj]->Y}")
corpus.count(req5, clustering_parameter=["X.upos"])

### Clustering results by other requests
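The clustering key can itself be a request: here the results are split according to whether `X` precedes `Y` (`X << Y`).
It answers the question: _How many subjects are in a pre-verbal position?_
Since `X` is the governor, a subject `Y` is pre-verbal when the test `X << Y` fails.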

In [None]:
corpus.count(req5, clustering_parameter=["{X << Y}"])

### Two clusterings can be applied

In [None]:
corpus.count(req5, clustering_parameter=["{X << Y}","X.upos"])

### More than two clusterings are also possible

In [None]:
corpus.count(req5, clustering_parameter=["{X << Y}","X.upos", "{X[Number=Sing]}"])
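With several clustering keys, the counts are easier to read once flattened into one row per combination of keys. The sketch below assumes that `count` returns nested dictionaries, one level per clustering key, and that a request key such as `{X << Y}` yields the keys `"Yes"` and `"No"`; both points are assumptions to check against your own output. The numbers are made up for illustration and do not come from the PUD corpus.

```python
def flatten_counts(counts, prefix=()):
    """Yield (key_path, count) pairs from a nested dict of counts."""
    if isinstance(counts, dict):
        for key, sub in counts.items():
            yield from flatten_counts(sub, prefix + (key,))
    else:
        yield prefix, counts


# Made-up counts in the assumed shape for two clustering keys.
toy_counts = {
    "Yes": {"VERB": 3, "AUX": 1},
    "No": {"VERB": 10, "AUX": 4},
}

# Largest clusters first.
rows = sorted(flatten_counts(toy_counts), key=lambda row: -row[1])
for path, n in rows:
    print(" / ".join(path), n)

total = sum(n for _, n in rows)
print(f"{total = }")
```

Because the helper recurses on dictionaries, the same code works for one, two or more clustering keys.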

### Search occurrences
Get the list of occurrences of a given request in the corpus.

In [None]:
occurrences = corpus.search(req1)
assert len(occurrences) == corpus.count(req1)
occurrences[0]
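Once the occurrences are available as a list, ordinary Python tools apply. The sketch below counts occurrences per sentence. It assumes that each occurrence is a dictionary with a `sent_id` key and a `matching` key holding the matched `nodes` and `edges`; compare with the output of `occurrences[0]` before relying on this shape. The toy list is hand-written, not real search output.

```python
from collections import Counter

# Hand-written occurrences in the assumed shape (not real search output).
toy_occurrences = [
    {"sent_id": "toy-1", "matching": {"nodes": {"X": "2", "Y": "1"}, "edges": {}}},
    {"sent_id": "toy-1", "matching": {"nodes": {"X": "5", "Y": "4"}, "edges": {}}},
    {"sent_id": "toy-2", "matching": {"nodes": {"X": "2", "Y": "1"}, "edges": {}}},
]

# Number of matches found in each sentence.
per_sentence = Counter(occ["sent_id"] for occ in toy_occurrences)
for sent_id, n in per_sentence.most_common():
    print(sent_id, n)
```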

### Get occurrences including edges
The edge is named `e`, and the label of the dependency is reported in the output.

In [None]:
req6 = Request().pattern("e: X->Y; X[upos=VERB]")
corpus.search(req6)[3]
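Naming the edge makes it possible to tally the dependency labels found by the search. The sketch below assumes that each occurrence stores the named edge under `matching["edges"]["e"]` as a dictionary with `source`, `label` and `target` keys, and that the label is a plain string; these are assumptions to verify against the actual output above. The toy occurrences are hand-written.

```python
from collections import Counter

# Hand-written occurrences in the assumed shape (not real search output).
toy_occurrences = [
    {"sent_id": "toy-1", "matching": {
        "nodes": {"X": "2", "Y": "1"},
        "edges": {"e": {"source": "2", "label": "subj", "target": "1"}}}},
    {"sent_id": "toy-1", "matching": {
        "nodes": {"X": "2", "Y": "3"},
        "edges": {"e": {"source": "2", "label": "punct", "target": "3"}}}},
    {"sent_id": "toy-2", "matching": {
        "nodes": {"X": "2", "Y": "1"},
        "edges": {"e": {"source": "2", "label": "subj", "target": "1"}}}},
]

# Count how often each label appears on the edge named `e`.
label_counts = Counter(
    occ["matching"]["edges"]["e"]["label"] for occ in toy_occurrences
)
print(label_counts.most_common())
```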

### As with `count`, we can cluster the results of a `search`

In [None]:
result = corpus.search(req6, clustering_parameter=["{X << Y}"])
result.keys()
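A clustered search result can be summarised by the size of each cluster. The sketch below assumes that the result is a dictionary mapping each cluster key to a list of occurrences, which is consistent with the call to `result.keys()` above but is otherwise an assumption. The toy occurrences are abbreviated to their `sent_id` for brevity.

```python
# Hand-written clustered result in the assumed shape; each occurrence is
# abbreviated to its sent_id.
toy_clustered = {
    "Yes": [{"sent_id": "toy-1"}, {"sent_id": "toy-2"}],
    "No": [{"sent_id": "toy-1"}],
}

# Size of each cluster, and the overall number of occurrences.
cluster_sizes = {key: len(occs) for key, occs in toy_clustered.items()}
total_occurrences = sum(cluster_sizes.values())

print(cluster_sizes)
print(f"{total_occurrences = }")
```

If the assumption holds, these sizes should match what `count` returns for the same request and clustering key.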