## Example 1: Loading tables into the paper database and filtering

This notebook demonstrates how to build the paper database from any combination of optional sub-tables of interest. It also shows how to perform simple filtering operations on the resulting table.

In [1]:
%load_ext autoreload
%autoreload 2

import os
os.chdir("..")

from src.db import Database

First, we specify which tables to load. In this example, the available tables are:
* active_inference
* bayesian_mechanics
* free_energy
* friston
* karl_friston
* predictive_coding
* predictive_processing

Each table holds the PubMed results for a search on the corresponding term. The database can be built from any combination of the tables above. We pick two for this example and load them with the `load()` method.

### 1.1 Data loading

In [3]:
tables = ["active_inference", "bayesian_mechanics"]
database = Database()
database.load(tables=tables)

INFO:src.db:Checking tables...
INFO:src.db:Loading tables...
INFO:src.db:Tables downloaded from PubMed on Thursday, Sept. 14, 2023.
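
To make the idea of building one database from several search tables concrete, here is a minimal pandas sketch. It is not the implementation of `Database.load()`; the helper `combine_tables` is hypothetical, the two frames are trimmed stand-ins for the real exports, and the overlap between them is assumed purely for illustration. Whether `load()` removes duplicate papers is not stated in this notebook.

```python
import pandas as pd

# Trimmed stand-ins for two search exports (ids and titles taken from the
# outputs in this notebook; which table each row sits in is an assumption).
active_inference = pd.DataFrame(
    {
        "id": [30120751, 27375276, 33286324],
        "title": [
            "Active Inference, Novelty and Neglect",
            "Active inference and learning",
            "Modules or Mean-Fields?",
        ],
    }
)
bayesian_mechanics = pd.DataFrame(
    {
        "id": [37154143, 33286324],  # 33286324 repeated on purpose (assumed overlap)
        "title": ["Really radical?", "Modules or Mean-Fields?"],
    }
)


def combine_tables(tables):
    """Hypothetical helper: stack per-term tables and keep one row per PubMed id."""
    combined = pd.concat(tables, ignore_index=True)
    return combined.drop_duplicates(subset="id").reset_index(drop=True)


combined = combine_tables([active_inference, bayesian_mechanics])
print(combined)
```

Deduplicating on `id` matters as soon as two search terms return the same paper, which is likely for closely related terms.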


### 1.2 Data filtering

First, let's view the full table stored in `database.db`.

In [4]:
# View database object
database.db

Unnamed: 0,id,title,authors,citation,first_author,journal_book,publication_year,create_date,pmcid,nihms_id,doi
0,34968557,Active inference leads to Bayesian neurophysio...,Isomura T.,Neurosci Res. 2022 Feb;175:38-45. doi: 10.1016...,Isomura T,Neurosci Res,2022,2021/12/30,,,10.1016/j.neures.2021.12.003
1,33343039,Active inference on discrete state-spaces: A s...,"Da Costa L, Parr T, Sajid N, Veselic S, Neacsu...",J Math Psychol. 2020 Dec;99:102447. doi: 10.10...,Da Costa L,J Math Psychol,2020,2020/12/21,PMC7732703,,10.1016/j.jmp.2020.102447
2,30120751,"Active Inference, Novelty and Neglect","Parr T, Friston KJ.",Curr Top Behav Neurosci. 2019;41:115-128. doi:...,Parr T,Curr Top Behav Neurosci,2019,2018/08/19,,,10.1007/7854_2018_61
3,27375276,Active inference and learning,"Friston K, FitzGerald T, Rigoli F, Schwartenbe...",Neurosci Biobehav Rev. 2016 Sep;68:862-879. do...,Friston K,Neurosci Biobehav Rev,2016,2016/07/05,PMC5167251,,10.1016/j.neubiorev.2016.06.022
4,34687699,"Active inference, selective attention, and the...","Holmes E, Parr T, Griffiths TD, Friston KJ.",Neurosci Biobehav Rev. 2021 Dec;131:1288-1304....,Holmes E,Neurosci Biobehav Rev,2021,2021/10/23,PMC8643962,NIHMS1754010,10.1016/j.neubiorev.2021.09.038
...,...,...,...,...,...,...,...,...,...,...,...
3,37154143,Really radical?,Friston K.,Behav Brain Sci. 2023 May 8;46:e93. doi: 10.10...,Friston K,Behav Brain Sci,2023,2023/05/08,,,10.1017/S0140525X2200276X
4,33286324,Modules or Mean-Fields?,"Parr T, Sajid N, Friston KJ.",Entropy (Basel). 2020 May 14;22(5):552. doi: 1...,Parr T,Entropy (Basel),2020,2020/12/08,PMC7517075,EMS86658,10.3390/e22050552
5,36821891,From the free energy principle to a confederat...,"Aguilera M, Millidge B, Tschantz A, Buckley CL.",Phys Life Rev. 2023 Mar;44:270-275. doi: 10.10...,Aguilera M,Phys Life Rev,2023,2023/02/23,,,10.1016/j.plrev.2023.01.018
6,31865883,"Markov blankets, information geometry and stoc...","Parr T, Da Costa L, Friston K.",Philos Trans A Math Phys Eng Sci. 2020 Feb 7;3...,Parr T,Philos Trans A Math Phys Eng Sci,2020,2019/12/24,PMC6939234,,10.1098/rsta.2019.0159


From here, there are several ways to inspect the papers. First, let's list the columns in the database and then select a single column from the table.

In [5]:
# List columns in the database
database.list_columns()

Current columns: 
 ['id', 'title', 'authors', 'citation', 'first_author', 'journal_book', 'publication_year', 'create_date', 'pmcid', 'nihms_id', 'doi']


In [6]:
# Select a column from the table
database.select(fields="title").head(10)

0    Active inference leads to Bayesian neurophysio...
1    Active inference on discrete state-spaces: A s...
2                Active Inference, Novelty and Neglect
3                        Active inference and learning
4    Active inference, selective attention, and the...
5    An active inference perspective on the negativ...
6                    Active inference through whiskers
7     Active inference, communication and hermeneutics
8                      Resilience and active inference
9    An active inference model of hierarchical acti...
Name: title, dtype: object

### 1.3 Sorting, dropping, and filtering

Next, we can sort the table, drop rows from it, or filter it by a specific field of interest. The filtering options are deliberately simple and exist for convenience and quick inspection. For more complex operations, we recommend using `pandas` or a similar library.

In [7]:
# Sort the table
database.sort(fields=["title", "publication_year"], ascending=True).head(5)

Unnamed: 0,id,title,authors,citation,first_author,journal_book,publication_year,create_date,pmcid,nihms_id,doi
421,30984063,"""Surprise"" and the Bayesian Brain: Implication...","Holmes J, Nolte T.",Front Psychol. 2019 Mar 28;10:592. doi: 10.338...,Holmes J,Front Psychol,2019,2019/04/16,PMC6447687,,10.3389/fpsyg.2019.00592
164,32460940,"""Through others we become ourselves"": The dial...","Bolis D, Schilbach L.",Behav Brain Sci. 2020 May 28;43:e93. doi: 10.1...,Bolis D,Behav Brain Sci,2020,2020/05/29,,,10.1017/S0140525X19002917
211,29780343,'Seeing the Dark': Grounding Phenomenal Transp...,"Limanowski J, Friston K.",Front Psychol. 2018 May 4;9:643. doi: 10.3389/...,Limanowski J,Front Psychol,2018,2018/05/22,PMC5945877,,10.3389/fpsyg.2018.00643
170,33733186,A Bayesian Account of Generalist and Specialis...,"Chen AG, Benrimoh D, Parr T, Friston KJ.",Front Artif Intell. 2020 Sep 3;3:69. doi: 10.3...,Chen AG,Front Artif Intell,2020,2021/03/18,PMC7861269,,10.3389/frai.2020.00069
360,30381799,A Bayesian Account of Psychopathy: A Model of ...,"Prosser A, Friston KJ, Bakker N, Parr T.",Comput Psychiatr. 2018 Oct;2:92-140. doi: 10.1...,Prosser A,Comput Psychiatr,2018,2018/11/02,PMC6184370,,10.1162/cpsy_a_00016


In [8]:
# Drop rows from the table
database.drop_rows(ids=[30984063, 32460940]).head(5)

Unnamed: 0,id,title,authors,citation,first_author,journal_book,publication_year,create_date,pmcid,nihms_id,doi
0,34968557,Active inference leads to Bayesian neurophysio...,Isomura T.,Neurosci Res. 2022 Feb;175:38-45. doi: 10.1016...,Isomura T,Neurosci Res,2022,2021/12/30,,,10.1016/j.neures.2021.12.003
1,33343039,Active inference on discrete state-spaces: A s...,"Da Costa L, Parr T, Sajid N, Veselic S, Neacsu...",J Math Psychol. 2020 Dec;99:102447. doi: 10.10...,Da Costa L,J Math Psychol,2020,2020/12/21,PMC7732703,,10.1016/j.jmp.2020.102447
2,30120751,"Active Inference, Novelty and Neglect","Parr T, Friston KJ.",Curr Top Behav Neurosci. 2019;41:115-128. doi:...,Parr T,Curr Top Behav Neurosci,2019,2018/08/19,,,10.1007/7854_2018_61
3,27375276,Active inference and learning,"Friston K, FitzGerald T, Rigoli F, Schwartenbe...",Neurosci Biobehav Rev. 2016 Sep;68:862-879. do...,Friston K,Neurosci Biobehav Rev,2016,2016/07/05,PMC5167251,,10.1016/j.neubiorev.2016.06.022
4,34687699,"Active inference, selective attention, and the...","Holmes E, Parr T, Griffiths TD, Friston KJ.",Neurosci Biobehav Rev. 2021 Dec;131:1288-1304....,Holmes E,Neurosci Biobehav Rev,2021,2021/10/23,PMC8643962,NIHMS1754010,10.1016/j.neubiorev.2021.09.038


In [9]:
# Filter by field ---> Papers where Thomas Parr or Lancelot Da Costa is the first author
database.filter_by_field(field="first_author", terms=["Parr T", "Da Costa L"]).head(5)

Unnamed: 0,id,title,authors,citation,first_author,journal_book,publication_year,create_date,pmcid,nihms_id,doi
1,33343039,Active inference on discrete state-spaces: A s...,"Da Costa L, Parr T, Sajid N, Veselic S, Neacsu...",J Math Psychol. 2020 Dec;99:102447. doi: 10.10...,Da Costa L,J Math Psychol,2020,2020/12/21,PMC7732703,,10.1016/j.jmp.2020.102447
2,30120751,"Active Inference, Novelty and Neglect","Parr T, Friston KJ.",Curr Top Behav Neurosci. 2019;41:115-128. doi:...,Parr T,Curr Top Behav Neurosci,2019,2018/08/19,,,10.1007/7854_2018_61
25,35327872,How Active Inference Could Help Revolutionise ...,"Da Costa L, Lanillos P, Sajid N, Friston K, Kh...",Entropy (Basel). 2022 Mar 2;24(3):361. doi: 10...,Da Costa L,Entropy (Basel),2022,2022/03/25,PMC8946999,,10.3390/e24030361
32,37080424,Cognitive effort and active inference,"Parr T, Holmes E, Friston KJ, Pezzulo G.",Neuropsychologia. 2023 Jun 6;184:108562. doi: ...,Parr T,Neuropsychologia,2023,2023/04/20,,,10.1016/j.neuropsychologia.2023.108562
34,34803619,"Understanding, Explanation, and Active Inference","Parr T, Pezzulo G.",Front Syst Neurosci. 2021 Nov 5;15:772641. doi...,Parr T,Front Syst Neurosci,2021,2021/11/22,PMC8602880,,10.3389/fnsys.2021.772641
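
For queries that go beyond a single field, plain pandas is usually enough. The sketch below assumes that `database.db` is a pandas DataFrame (the outputs above suggest so, but this is not confirmed here) and that `publication_year` is stored as an integer. It uses a small frame built from rows shown in this notebook in place of `database.db`, and selects papers from 2020 onward that list Friston as a co-author but not as first author.

```python
import pandas as pd

# Stand-in for `database.db`, built from rows shown in the outputs above.
papers = pd.DataFrame(
    {
        "id": [30120751, 37080424, 34803619, 37154143, 33286324, 31865883],
        "authors": [
            "Parr T, Friston KJ.",
            "Parr T, Holmes E, Friston KJ, Pezzulo G.",
            "Parr T, Pezzulo G.",
            "Friston K.",
            "Parr T, Sajid N, Friston KJ.",
            "Parr T, Da Costa L, Friston K.",
        ],
        "first_author": ["Parr T", "Parr T", "Parr T", "Friston K", "Parr T", "Parr T"],
        "publication_year": [2019, 2023, 2021, 2023, 2020, 2020],
    }
)

# Combine several conditions with boolean masks.
recent = papers["publication_year"] >= 2020
has_friston = papers["authors"].str.contains("Friston", regex=False)
not_first = ~papers["first_author"].str.startswith("Friston")

# Newest papers first; ties broken by id for a stable order.
result = papers[recent & has_friston & not_first].sort_values(
    ["publication_year", "id"], ascending=[False, True]
)
print(result[["id", "first_author", "publication_year"]])
```

In the notebook itself, replacing `papers` with `database.db` should give the same kind of result, provided the assumptions above hold.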


### 1.4 Add/remove papers and saving/loading the database

None of the operations above alter the `database.db` object itself; they only affect the returned output. The `add_papers_to_db()` and `remove_papers_from_db()` methods, by contrast, mutate `database.db`. They also write a record of every added or removed paper to an output path.

These operations were originally conceived because PubMed does not cover every paper, and pre-prints in particular may be missing. Searches also sometimes return papers that are not actually related to the query. Thus, one can remove irrelevant papers or add papers from other sources such as arXiv.

To make the database as reproducible as possible, the add/remove methods keep a record of the papers that were added or removed. This way, one can always reload the PubMed tables and add or remove specific papers according to that add/remove history.
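
The reproducibility idea can be sketched as replaying a history on top of freshly loaded tables. The code below is an illustration only: the history format (a list of `action`/`id` records), the helper `replay_history`, and the extra paper with placeholder id `1` are all hypothetical and do not reflect the files actually written by `add_papers_to_db()` or `remove_papers_from_db()`.

```python
import pandas as pd

# Freshly loaded table (ids and titles taken from the outputs in this notebook).
base = pd.DataFrame(
    {
        "id": [30120751, 27375276, 37154143],
        "title": [
            "Active Inference, Novelty and Neglect",
            "Active inference and learning",
            "Really radical?",
        ],
    }
)

# Papers from other sources, keyed by a placeholder id (hypothetical entry).
extra_papers = pd.DataFrame({"id": [1], "title": ["Hypothetical preprint"]})

# Hypothetical add/remove history, applied in order.
history = [
    {"action": "remove", "id": 37154143},
    {"action": "add", "id": 1},
]


def replay_history(table, records, extras):
    """Hypothetical helper: apply add/remove records to a copy of `table`."""
    db = table.copy()
    for record in records:
        if record["action"] == "remove":
            db = db[db["id"] != record["id"]]
        elif record["action"] == "add":
            row = extras[extras["id"] == record["id"]]
            db = pd.concat([db, row], ignore_index=True)
        else:
            raise ValueError(f"Unknown action: {record['action']!r}")
    return db.reset_index(drop=True)


rebuilt = replay_history(base, history, extra_papers)
print(rebuilt)
```

Because the history is applied to a copy, the freshly loaded table stays untouched and the same history can be replayed after every new download.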


In [10]:
# Add papers

# Remove papers



To save the table, run `database.save("/path/to/output.csv")`.

### 1.5 Refreshing the table

To incorporate the latest papers from PubMed, these updates can be merged into the existing table. The `refresh_db()` function joins a newer database to an older exported one.

In [None]:
# Refresh table
# TODO
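
Until that cell is filled in, the general idea of joining a newer table to an older export can be sketched with pandas. This is an assumption about the behaviour, not the implementation of `refresh_db()`: the helper `merge_exports` is hypothetical, and the choice to let the newer row win when an `id` appears in both tables is a design decision made for this sketch.

```python
import pandas as pd

# Older exported table and a newer download (ids and titles taken from the
# outputs in this notebook; their split into "old" and "new" is assumed).
old_db = pd.DataFrame(
    {
        "id": [30120751, 27375276],
        "title": [
            "Active Inference, Novelty and Neglect",
            "Active inference and learning",
        ],
    }
)
new_db = pd.DataFrame(
    {
        "id": [27375276, 37080424],
        "title": [
            "Active inference and learning",
            "Cognitive effort and active inference",
        ],
    }
)


def merge_exports(old, new):
    """Hypothetical helper: union of both tables, newer row wins on duplicate ids."""
    merged = pd.concat([old, new], ignore_index=True)
    # keep="last" keeps the row from `new` when an id appears in both tables.
    return merged.drop_duplicates(subset="id", keep="last").reset_index(drop=True)


refreshed = merge_exports(old_db, new_db)
print(refreshed)
```

Keeping the newer row means corrected metadata from a later download replaces the stale copy, while papers only present in the older export are preserved.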