In [None]:
#| hide
%load_ext autoreload
%autoreload 2

In [None]:
#| hide
from pubmed_lib.search import *
from pubmed_lib.data import *
from nbdev.showdoc import *
from dotenv import load_dotenv
from Bio import Entrez
import os

In [None]:
#| hide
load_dotenv('pass.env')

True

# pubmed_lib

> Library to search and parse data from PubMed. It provides functions to search by author or affiliation, retrieve publications and authors, and produce some visualizations.

## Install

```sh
pip install pubmed_lib
```

## How to use

This library provides the basic functionality to retrieve data from a PubMed search for use with LangChain Q&A systems. It is a simple wrapper around Biopython.

First, create a `Search` object with the default parameters you want to use in your searches.

In [None]:
show_doc(Search)

---

[source](https://github.com/Dmaturana81/pubmed_lib/blob/main/pubmed_lib/search.py#LNone){target="_blank" style="float:right; font-size:smaller"}

### Search

>      Search (search_tag:str=None, retmax:int=200, retmode:str='xml',
>              sort:str='relevance', mindate:Optional[int]=None,
>              maxdate:Optional[int]=None, idlist:Optional[List[int]]=None,
>              email:str=None, api_key:str=None)

Search class to wrap the search and results

Below we create the object with a maximum of 10 results.

The `search_tag` corresponds to the field in which the search is performed. The available options are the following:

In [None]:
#| hide_input
for k in SEARCH_TAGS.keys():
    print(f"{k}")

Affiliation
All Fields
Article Identifier
Author
Author Identifier
EC/RN Number
First Author Name
Full Author Name
Full Investigator Name
Grant Number
Investigator
Journal
Last Author Name
Location ID
MeSH Major Topic
MeSH Subheadings
MeSH Terms
Other Term
PMID
Subset
Text Words
Title
Title/Abstract


By default, the search is set up to use Title/Abstract.
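In PubMed's own query syntax, a field restriction is written as a bracketed tag placed after the term. The sketch below illustrates that convention only. `tag_query` is a hypothetical helper and not part of pubmed_lib, and how the library applies `search_tag` internally is not shown here.

```python
def tag_query(query: str, search_tag: str = "Title/Abstract") -> str:
    """Append a PubMed field tag to a query, e.g. 'cancer[Title/Abstract]'.

    Hypothetical helper for illustration; not part of pubmed_lib.
    """
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    # PubMed reads the bracketed tag as a restriction on the preceding term.
    return f"{query}[{search_tag}]"


print(tag_query("Bi-functional degraders in cancer"))
print(tag_query("Smith J", search_tag="Author"))
```

The tag names passed as `search_tag` here are taken from the list above.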

In [None]:
search = Search(retmax=10)
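The constructor also accepts `email` and `api_key`. NCBI asks E-utilities clients to identify themselves with an email address, and an API key raises the allowed request rate. A minimal sketch of collecting both from the environment follows. The variable names `PUBMED_EMAIL` and `PUBMED_API_KEY` and the helper `search_kwargs_from_env` are assumptions made for this example, not names the library defines.

```python
import os
from typing import Mapping, Optional


def search_kwargs_from_env(env: Optional[Mapping[str, str]] = None,
                           retmax: int = 10) -> dict:
    """Collect keyword arguments for Search(...) from environment variables.

    PUBMED_EMAIL and PUBMED_API_KEY are hypothetical variable names.
    """
    env = os.environ if env is None else env
    kwargs = {"retmax": retmax}
    # Only pass credentials that are actually set, so the
    # constructor defaults stay in effect otherwise.
    if env.get("PUBMED_EMAIL"):
        kwargs["email"] = env["PUBMED_EMAIL"]
    if env.get("PUBMED_API_KEY"):
        kwargs["api_key"] = env["PUBMED_API_KEY"]
    return kwargs


kwargs = search_kwargs_from_env({"PUBMED_EMAIL": "you@example.org"})
print(kwargs)
# With pubmed_lib installed: search = Search(**kwargs)
```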

To actually run the search, call the `search` method with the query.

In [None]:
show_doc(Search.search)

---

[source](https://github.com/Dmaturana81/pubmed_lib/blob/main/pubmed_lib/search.py#LNone){target="_blank" style="float:right; font-size:smaller"}

### Search.search

>      Search.search (query:str)

Receives a query to be searched in PubMed and returns the handler of the search

|    | **Type** | **Details** |
| -- | -------- | ----------- |
| query | str | Query to be searched in PubMed |

In [None]:
results = search.search('Bi-functional degraders in cancer')


In [None]:
results

['35285613', '33672989', '23749892', '17310834', '35644005', '16870428', '29587668', '21269262', '25685909', '27815492']
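Since the library wraps Biopython, a search like this presumably maps onto `Bio.Entrez.esearch`. The sketch below is not the library's code: `build_esearch_params` and `esearch_ids` are hypothetical names, and the only Biopython calls used are `Entrez.esearch` and `Entrez.read`. The Biopython import is deferred so that assembling the parameters works without Biopython or network access.

```python
from typing import List, Optional


def build_esearch_params(term: str, retmax: int = 200, sort: str = "relevance",
                         mindate: Optional[int] = None,
                         maxdate: Optional[int] = None) -> dict:
    """Assemble E-utilities ESearch parameters (hypothetical helper)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax,
              "retmode": "xml", "sort": sort}
    # ESearch expects mindate and maxdate together; pdat = publication date.
    if mindate is not None and maxdate is not None:
        params.update(datetype="pdat", mindate=str(mindate), maxdate=str(maxdate))
    return params


def esearch_ids(params: dict, email: str) -> List[str]:
    """Run the search and return PubMed IDs as strings (needs network access)."""
    from Bio import Entrez  # deferred import: only needed for the real request

    Entrez.email = email
    handle = Entrez.esearch(**params)
    try:
        record = Entrez.read(handle)
    finally:
        handle.close()
    return [str(pmid) for pmid in record["IdList"]]


params = build_esearch_params("Bi-functional degraders in cancer[Title/Abstract]",
                              retmax=10)
print(params)
```

Calling `esearch_ids(params, email="you@example.org")` would then perform the request and return a list of ID strings in the same shape as the output above.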

To fetch the results, call the `fetch_details` method and pass the list of PubMed IDs retrieved previously.

In [None]:
articles = search.fetch_details(results)

This will give you the XML data retrieved from PubMed.
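To give a feel for what parsing that XML involves, here is a minimal sketch using only the standard library. The sample record is synthetic: it mimics the PubMed XML layout, and every value in it is a placeholder. `parse_articles` is a hypothetical function, not the parser the library uses.

```python
import xml.etree.ElementTree as ET

# Synthetic record that mimics the PubMed XML layout; all values are placeholders.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>00000000</PMID>
      <Article>
        <ArticleTitle>Placeholder title</ArticleTitle>
        <Abstract><AbstractText>Placeholder abstract.</AbstractText></Abstract>
      </Article>
    </MedlineCitation>
    <PubmedData>
      <ArticleIdList>
        <ArticleId IdType="pubmed">00000000</ArticleId>
        <ArticleId IdType="doi">10.0000/placeholder</ArticleId>
      </ArticleIdList>
    </PubmedData>
  </PubmedArticle>
</PubmedArticleSet>"""


def parse_articles(xml_text: str) -> list:
    """Extract identifiers and abstract text from PubMed-style XML."""
    parsed = []
    for article in ET.fromstring(xml_text).iter("PubmedArticle"):
        # Map each identifier type (pubmed, doi, pmc, pii) to its value.
        id_nodes = article.findall("PubmedData/ArticleIdList/ArticleId")
        ids = {node.get("IdType"): node.text for node in id_nodes}
        # itertext() flattens inline markup such as <sub>, keeping only the text.
        sections = article.findall("MedlineCitation/Article/Abstract/AbstractText")
        abstract = " ".join("".join(s.itertext()).strip() for s in sections)
        parsed.append({"pubmed": ids.get("pubmed"), "pmc": ids.get("pmc"),
                       "doi": ids.get("doi"), "pii": ids.get("pii"),
                       "abstract": abstract or None})
    return parsed


print(parse_articles(SAMPLE_XML))
```

The keys of each dictionary follow the fields of the `Result` objects in this README (`pubmed`, `pmc`, `doi`, `pii`, `abstract`).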

To retrieve the parsed results, use the `results` method.

In [None]:
show_doc(Search.results)

---

[source](https://github.com/Dmaturana81/pubmed_lib/blob/main/pubmed_lib/search.py#LNone){target="_blank" style="float:right; font-size:smaller"}

### Search.results

>      Search.results (query:str)

Method that does the search and returns a generator with all the information of the articles

|    | **Type** | **Details** |
| -- | -------- | ----------- |
| query | str | Term to be queried in PubMed |
| **Returns** | **Generator** |  |

In [None]:
results = search.results('Bi-functional degraders in cancer')

In [None]:
res = list(results)
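Because `results` returns a generator, it can be consumed only once, which is why the cell above materializes it with `list`. The sketch below shows that behaviour with a stand-in generator. `fake_results` and its placeholder items are assumptions for illustration and do not call the library.

```python
from itertools import islice
from typing import Iterator


def fake_results() -> Iterator[dict]:
    """Stand-in for the generator returned by Search.results (placeholder data)."""
    for n in range(1, 6):
        yield {"pubmed": f"placeholder-{n}"}


gen = fake_results()
first_two = list(islice(gen, 2))  # consume only the first two items
remaining = list(gen)             # the generator resumes where islice stopped
exhausted = list(gen)             # nothing is left on a second pass

print(len(first_two), len(remaining), len(exhausted))  # prints: 2 3 0
```

Taking a slice with `islice` is handy when only the first few articles are needed and the rest should not be parsed.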

In [None]:
res[0]

Result(pubmed='35285613', pmc=None, doi='10.1021/acsabm.1c01216', pii=None, abstract="Gold nanorods (AuNRs) remain well-developed inorganic nanocarriers of small molecules for a plethora of biomedical and therapeutic applications. However, the delivery of therapeutic proteins using AuNRs with high protein loading capacity (LC), serum stability, excellent target specificity, and minimal off-target protein release is not known. Herein, we report two bi-functional AuNR-protein nanoconjugates, AuNR@EGFP-BSA<sub>FA</sub> and AuNR@RNaseA-BSA<sub>FA</sub>, supramolecularly coated with folic acid-modified BSA (BSA<sub>FA</sub>) acting as biomimetic protein corona to demonstrate targeted cytosolic delivery of enhanced green fluorescent protein (EGFP) and therapeutic ribonuclease A enzyme (RNase A) in their functional forms. AuNR@EGFP-BSA<sub>FA</sub> and AuNR@RNaseA-BSA<sub>FA</sub> exhibit high LCs of ∼42 and ∼54%, respectively, increased colloidal stability, and rapid protein release in the p

In [None]:
res[0].dict()

{'pubmed': '35285613',
 'pmc': None,
 'doi': '10.1021/acsabm.1c01216',
 'pii': None,
 'abstract': "Gold nanorods (AuNRs) remain well-developed inorganic nanocarriers of small molecules for a plethora of biomedical and therapeutic applications. However, the delivery of therapeutic proteins using AuNRs with high protein loading capacity (LC), serum stability, excellent target specificity, and minimal off-target protein release is not known. Herein, we report two bi-functional AuNR-protein nanoconjugates, AuNR@EGFP-BSA<sub>FA</sub> and AuNR@RNaseA-BSA<sub>FA</sub>, supramolecularly coated with folic acid-modified BSA (BSA<sub>FA</sub>) acting as biomimetic protein corona to demonstrate targeted cytosolic delivery of enhanced green fluorescent protein (EGFP) and therapeutic ribonuclease A enzyme (RNase A) in their functional forms. AuNR@EGFP-BSA<sub>FA</sub> and AuNR@RNaseA-BSA<sub>FA</sub> exhibit high LCs of ∼42 and ∼54%, respectively, increased colloidal stability, and rapid protein rel