# Retrieve all papers from one (or more) Zotero collections

This notebook shows how to retrieve all papers from one of my Zotero collections. The eventual goal is to extract the abstracts of the papers in this collection and use them to make a word cloud.

## Setup

In [76]:
from pyzotero import zotero
from pprint import pprint
import pickle

library_id = 5791072
library_type = 'user'
# Replace with your own Zotero API key; avoid committing a real key to a shared notebook
api_key = 'YOUR_ZOTERO_API_KEY'
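
As an alternative to hard-coding credentials in the notebook, they can be read from environment variables. The following is a minimal sketch, not part of the original notebook: the helper name `load_zotero_credentials` and the variable names `ZOTERO_LIBRARY_ID`, `ZOTERO_LIBRARY_TYPE`, and `ZOTERO_API_KEY` are assumptions chosen for illustration.

```python
import os


def load_zotero_credentials(env=None):
    """Read Zotero credentials from environment variables.

    The variable names used here are hypothetical; pick whatever suits your setup.
    """
    env = os.environ if env is None else env
    library_id = env.get("ZOTERO_LIBRARY_ID")
    api_key = env.get("ZOTERO_API_KEY")
    if not library_id or not api_key:
        raise RuntimeError("Set ZOTERO_LIBRARY_ID and ZOTERO_API_KEY before running.")
    # Default to a personal ('user') library, as in this notebook
    library_type = env.get("ZOTERO_LIBRARY_TYPE", "user")
    return library_id, library_type, api_key
```

The returned tuple could then be unpacked into `library_id, library_type, api_key` in place of the literal values above.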

## Retrieve all Zotero papers

This is done using the Python package [Pyzotero](https://pyzotero.readthedocs.io/en/latest/).


In [52]:
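# Create the API client for my library
zot = zotero.Zotero(library_id, library_type, api_key)

# zot.items() returns at most 100 items per call by default,
# so len(my_library) can be smaller than zot.count_items()
my_library = zot.items()
print(f"Number of items in my library: {zot.count_items()}")
print(f"Number of items in my_library: {len(my_library)}")

print("\n Zotero item types: ")
pprint(zot.item_types())

print("\n Zotero item fields: ")
pprint(zot.item_fields())

# Reuse the items already fetched instead of calling the API again
print("\n Example items: ")
pprint(my_library[0:1])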

Number of items in my library: 1106
Number of items in my_library: 100

 Zotero item types: 
[{'itemType': 'artwork', 'localized': 'Artwork'},
 {'itemType': 'audioRecording', 'localized': 'Audio Recording'},
 {'itemType': 'bill', 'localized': 'Bill'},
 {'itemType': 'blogPost', 'localized': 'Blog Post'},
 {'itemType': 'book', 'localized': 'Book'},
 {'itemType': 'bookSection', 'localized': 'Book Section'},
 {'itemType': 'case', 'localized': 'Case'},
 {'itemType': 'computerProgram', 'localized': 'Computer Program'},
 {'itemType': 'conferencePaper', 'localized': 'Conference Paper'},
 {'itemType': 'dictionaryEntry', 'localized': 'Dictionary Entry'},
 {'itemType': 'document', 'localized': 'Document'},
 {'itemType': 'email', 'localized': 'E-mail'},
 {'itemType': 'encyclopediaArticle', 'localized': 'Encyclopedia Article'},
 {'itemType': 'film', 'localized': 'Film'},
 {'itemType': 'forumPost', 'localized': 'Forum Post'},
 {'itemType': 'hearing', 'localized': 'Hearing'},
 {'itemType': 'instantMe

Store only the items that are in my collection of interest. 

Note: from the previous tutorial, `ZoteroCollectionTreeTutorial`, I know that I want to extract 76 papers from the collection `Z6S8BR85: Primate Hc Ephys`.

_**Pseudocode**:_
```
store collection key (or keys) of interest called include_collection
create empty list of papers called include_papers

for each item in my zotero library:
    if it is in any of the collections listed in include_collection:
        append item to include_papers
```
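
The pseudocode above extends naturally to several collections at once. The sketch below is one possible way to write the filter as a reusable function; the name `filter_by_collections` and the toy items are made up for illustration, and only the `item['data']['collections']` structure is taken from this notebook.

```python
def filter_by_collections(items, include_collections):
    """Return the items that belong to at least one of the given collection keys."""
    # Accept a single key as well as a list of keys
    if isinstance(include_collections, str):
        include_collections = [include_collections]
    wanted = set(include_collections)
    return [
        item for item in items
        # Some items have no 'collections' field, so fall back to an empty list
        if wanted.intersection(item["data"].get("collections", []))
    ]


# Toy items with made-up collection keys, mimicking the structure Pyzotero returns
toy_items = [
    {"data": {"title": "A", "collections": ["AAAA1111"]}},
    {"data": {"title": "B", "collections": ["BBBB2222", "CCCC3333"]}},
    {"data": {"title": "C"}},
]
selected = filter_by_collections(toy_items, ["AAAA1111", "CCCC3333"])
print([item["data"]["title"] for item in selected])
```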

Note: there is probably a more efficient way to do the search and retrieval, but I couldn't quickly figure out how to use the search functionality of `zot.items()`, and I found the documentation hard to follow.
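
One possibly more direct route is Pyzotero's `collection_items()` method, which asks the API for the items of a single collection instead of downloading the whole library. The sketch below is untested against a live library. The helper name `fetch_collection_papers` is made up, and the assumption that child attachments and notes can be recognised by a `parentItem` key in their `data` should be checked against your own items.

```python
def fetch_collection_papers(zot, collection_keys):
    """Fetch the items of one or more collections, skipping child items and duplicates."""
    if isinstance(collection_keys, str):
        collection_keys = [collection_keys]
    papers = {}
    for key in collection_keys:
        # everything() keeps requesting pages, so more than 100 items can be returned
        for item in zot.everything(zot.collection_items(key)):
            # Assumption: child attachments and notes carry a 'parentItem' key
            if "parentItem" in item["data"]:
                continue
            # Index by item key so a paper filed in two collections appears once
            papers[item["key"]] = item
    return list(papers.values())


# Possible usage with the client created in this notebook:
# include_papers = fetch_collection_papers(zot, 'Z6S8BR85')
```

If this works as expected, the number of returned items should be comparable to the count produced by the loop over the full library.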

Note, the Pyzotero documentation states: 
>The Read API returns 25 results by default (the API documentation claims 50). In the interests of usability, Pyzotero returns 100 items by default, by setting the API `limit` parameter to 100, unless it’s set by the user. If you wish to retrieve e.g. all top-level items without specifiying a `limit` parameter, you’ll have to wrap your call with **`Zotero.everything()`**: `results = zot.everything(zot.top())`.
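
To make the paging behaviour concrete, here is a rough sketch of the idea behind retrieving everything: request one page at a time with `limit` and `start` until a short page comes back. This is not Pyzotero's actual implementation of `everything()`; the helper `fetch_all` and its `fetch_page` callback are hypothetical names used only for illustration.

```python
def fetch_all(fetch_page, page_size=100):
    """Call fetch_page(limit=..., start=...) until a page shorter than page_size arrives."""
    results = []
    start = 0
    while True:
        page = fetch_page(limit=page_size, start=start)
        results.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch
            break
        start += page_size
    return results


# With a real client this might be called as follows (untested sketch):
# all_items = fetch_all(lambda limit, start: zot.items(limit=limit, start=start))
```

In practice `zot.everything(...)` is the simpler choice; the sketch only illustrates why a plain `zot.items()` call returned 100 of the 1106 items.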

In [56]:
include_collection = 'Z6S8BR85'
include_papers = []

for item in zot.everything(zot.items()):
    if 'collections' in item['data'] and include_collection in item['data']['collections']:
        print("Found an item from the collection!")
        include_papers.append(item)
        
print(f"Number of items retrieved: {len(include_papers)}")


Found an item from the collection!
Found an item from the collection!
Found an item from the collection!
[... repeated output truncated ...]

## Extract abstracts from these papers, and save them in a file to be used for the word cloud

In [65]:
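# Show the abstract of the first retrieved paper, then confirm it is a plain string
pprint(include_papers[0]['data']['abstractNote'])
print(type(include_papers[0]['data']['abstractNote']))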

('we have undertaken a series of electrophysiological studies in the squirrel '
 'monkey to explore the possible relationships of the limbic cortex to the '
 'visual apparatus. This paper will give the results of exploration of the '
 'ventral hippocampal formation for slow and unit potentials evoked by photic '
 'stimulation. A comparison will be given of the form, distribution, and '
 'latency of these responses with those obtained by electrical stimulation of '
 'the olfactory tract, posterior cingulate gyrus and septum. In ,discussing '
 'the possible significance of these findings, reference will be made to work '
 'in progress on the posterior cingulate and retrosplenial areas.')
<class 'str'>


In [73]:
# Join all abstracts into one space-separated string;
# items that lack an 'abstractNote' field are treated as empty
concatenated_abstracts = ' '.join(
    item['data'].get('abstractNote', '') for item in include_papers
)

print(concatenated_abstracts)
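
Before building the word cloud it may be useful to know which items contributed no text. The following sketch separates non-empty abstracts from items without one. The function name `collect_abstracts` and the toy items are made up, and it assumes each item stores its title under `data['title']`.

```python
def collect_abstracts(items):
    """Return (abstracts, missing): non-empty abstracts, and titles of items without one."""
    abstracts, missing = [], []
    for item in items:
        data = item["data"]
        text = data.get("abstractNote", "").strip()
        if text:
            abstracts.append(text)
        else:
            # Assumption: the title lives in data['title']; fall back to a placeholder
            missing.append(data.get("title", "<untitled>"))
    return abstracts, missing


# Toy items for illustration only
toy_items = [
    {"data": {"title": "Paper 1", "abstractNote": "Hippocampal unit recordings."}},
    {"data": {"title": "Paper 2", "abstractNote": ""}},
    {"data": {"title": "Paper 3"}},
]
abstracts, missing = collect_abstracts(toy_items)
print(f"{len(abstracts)} abstracts found, {len(missing)} items without an abstract")
```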

In [78]:
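# Pickle the concatenated abstracts. Note: despite the .txt extension this is a
# binary pickle file, so read it back with pickle.load() rather than as plain text.
with open('concatenated_abstracts.txt', 'wb') as file:
    pickle.dump(concatenated_abstracts, file)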
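
If a human-readable file is preferred over a pickle, the string can also be written as UTF-8 text. This is a minimal sketch of that alternative, not what the notebook above does; the helper names `save_text` and `load_text` are made up for illustration.

```python
from pathlib import Path


def save_text(text, path):
    """Write a string to a UTF-8 encoded plain-text file."""
    Path(path).write_text(text, encoding="utf-8")


def load_text(path):
    """Read a UTF-8 encoded plain-text file back into a string."""
    return Path(path).read_text(encoding="utf-8")


# Possible usage with the string built above:
# save_text(concatenated_abstracts, 'concatenated_abstracts_plain.txt')
```

A plain-text file can be opened in any editor and read without `pickle`, which may be convenient for the word cloud step.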