# The EP full-text library - Lesson 3
This lesson dives into more advanced concepts of EPAB, the TIP implementation of the EP full-text library. We will introduce iterative result processing and result browsing. As in the first notebook, we start by creating an instance of the EPAB client. Remember that by default the client connects to a test database; for this lesson we will work with the full database.

In [10]:
# Importing the EPAB client
from epo.tipdata.epab import EPABClient

# creating an instance of the EPAB client with the PROD database
epab = EPABClient(env='PROD')


## Iterative result processing
When we work with the production database, some queries are likely to retrieve a very large number of publications. We have seen the `get_results()` method for getting data from the result of a query. This method fetches the data for all the publications matching the query in a single pass.

In [9]:
# We query for publications within the wireless communications field
q = epab.query_ipc("H04W%")
# Let's see the size of the results object
print('Our query result contains', q)

Our query result contains 183817 publications


### Getting all the results in one go
We could fetch all the results in one go with the `get_results()` method we already know. With a result set of this size, however, we risk running into memory problems or otherwise overloading our workspace.

In [11]:
all_results = q.get_results('title.en')

# Displaying how many results were downloaded
print('The amount of results downloaded in one go is', len(all_results))

The amount of results downloaded in one go is 183818


### Getting the results in batches
For queries of this size, particularly when you want to get more data than just the title, such as the full text of the description, it is a good idea to use the `iterator()` method. In the example below we will get the results in batches of 5000 documents. 

In [14]:
fetched = 0

# We call the iterator method and ask for batches of 5000 results
for batch in q.iterator("title.en", batch_size=5000):
    #the size of the batch, for didactic purposes
    batch_size = len(batch)

    #we add the fetched batch to the total amount of fetched documents
    fetched += batch_size
    
    #displaying the batch fetching operation
    print(f"In this iteration I have fetched {batch_size} publications. Total fetched: {fetched}")


In this iteration I have fetched 5000 publications. Total fetched: 5000
In this iteration I have fetched 5000 publications. Total fetched: 10000
In this iteration I have fetched 5000 publications. Total fetched: 15000
In this iteration I have fetched 5000 publications. Total fetched: 20000
In this iteration I have fetched 5000 publications. Total fetched: 25000
In this iteration I have fetched 5000 publications. Total fetched: 30000
In this iteration I have fetched 5000 publications. Total fetched: 35000
In this iteration I have fetched 5000 publications. Total fetched: 40000
In this iteration I have fetched 5000 publications. Total fetched: 45000
In this iteration I have fetched 5000 publications. Total fetched: 50000
In this iteration I have fetched 5000 publications. Total fetched: 55000
In this iteration I have fetched 5000 publications. Total fetched: 60000
In this iteration I have fetched 5000 publications. Total fetched: 65000
In this iteration I have fetched 5000 publications. 
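
The batching pattern above can be tried out without touching the database. The sketch below is only an illustration: `fake_iterator` is a hypothetical stand-in for `q.iterator()` that yields lists of dummy publication ids, and `summarise_batches` is a helper invented here, not part of EPAB. It shows how to keep only running totals so that no more than one batch is held in memory at a time, and how many batches to expect for the result count reported earlier.

```python
import math
from typing import Iterable, Iterator, List


def fake_iterator(total: int, batch_size: int) -> Iterator[List[int]]:
    """Hypothetical stand-in for q.iterator(): yields lists of dummy ids."""
    for start in range(0, total, batch_size):
        yield list(range(start, min(start + batch_size, total)))


def summarise_batches(batches: Iterable[List[int]]) -> dict:
    """Consume batches one by one, keeping only running totals."""
    n_batches = 0
    fetched = 0
    last_size = 0
    for batch in batches:
        n_batches += 1
        fetched += len(batch)
        last_size = len(batch)
    return {"batches": n_batches, "fetched": fetched, "last_batch": last_size}


TOTAL = 183_817      # result count printed for the H04W% query above
BATCH_SIZE = 5_000   # same batch size as in the lesson

summary = summarise_batches(fake_iterator(TOTAL, BATCH_SIZE))
expected_batches = math.ceil(TOTAL / BATCH_SIZE)

print(summary)
print("Expected number of batches:", expected_batches)
```

With these numbers the loop runs 37 times: 36 full batches of 5000 and a final, smaller batch of 3817. In a real run you would replace `fake_iterator(...)` with the iterator returned by the query and do your per-batch processing inside the loop.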

## Browsing the results
A powerful feature of the EPAB library is the ability to browse the results of a query with a rich download of all the data for each publication. Browsing is the virtual equivalent of viewing each complete publication returned by the query. The `browse_results()` method creates a widget that lets you download batches of 10 publications and inspect them.

In [15]:
# instantiating a browser object
browser = q.browse_results()

# we now run the browser 
browser

WidEPABPublicationsBrowser(btn_load_more=Btn(children=['Load more..'], color='primary', layout=None), header='…

### Controlling the browser with Python
The first 10 publications have been downloaded into the widget. You can inspect each one, accessing the bibliographic data, the richly formatted full-text fields, the drawings and the search report. The widget includes a `Load more` button that lets you continue browsing in batches of 10 publications. The browser object can also be controlled from Python, with methods for paging through the results and for jumping to a given publication within the results.

In [16]:
# Asking the browser to show the next publication
browser.next()

In [17]:
# Asking the browser to show the previous publication
browser.previous()

In [18]:
# Asking the browser to show a specific publication based on its index
browser.selected_pub = 2