## Example 2: Creating and adding tags to the database

In [1]:
%load_ext autoreload
%autoreload 2

import os
os.chdir("..")

from src.tags import Tags


### 2.1 Creating and loading a tag file

If you already have a tag file, you can just load it from a specific path. Here we will assume that no tag file exists yet. We will create one first and then load it.

In [2]:
# Initialize Tag class
tags = Tags()

# Create an empty tag file (make sure to give it the yaml extension)
tags.create_empty_tag_file("data/tag_files/example_tag_file.yaml")

# Load the tag file
tags.load("data/tag_files/example_tag_file.yaml")

INFO:src.tags:Empty tag YAML output to data/tag_files/example_tag_file.yaml.
INFO:src.tags:YAML tag file successfully loaded from data/tag_files/example_tag_file.yaml.


Now let's view the current tag file.

In [3]:
# View the currently loaded tag file
tags.tags

{'category': '', 'tag_list': [], 'tagged_papers': []}

Not very inspiring. This is an empty tag file: a dict with three keys. The `category` key holds the tag category as a string. The `tag_list` key contains a list of possible tags one can add. The `tagged_papers` key will hold a list of dicts, each with a paper `id` and the `tag`s (from the tag list) that are associated with that id. This will be demonstrated below.
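
To make the layout concrete, here is a minimal sketch of a populated tag file as a plain Python dict, together with a small structural check. The key names come from the output above and the values from the tags used later in this example; `check_tag_file` is a hypothetical helper and not part of the `Tags` class.

```python
# Hypothetical illustration of the tag file layout; not the Tags implementation.
empty = {"category": "", "tag_list": [], "tagged_papers": []}

populated = {
    "category": "Bayesian mechanics",
    "tag_list": ["information geometry", "Markov blankets"],
    "tagged_papers": [
        {"id": 31865883, "tag": ["information geometry"]},
    ],
}


def check_tag_file(tag_file):
    """Return True if the dict has the three expected keys with the expected types."""
    if set(tag_file) != {"category", "tag_list", "tagged_papers"}:
        return False
    if not isinstance(tag_file["category"], str):
        return False
    if not isinstance(tag_file["tag_list"], list):
        return False
    if not isinstance(tag_file["tagged_papers"], list):
        return False
    # Each tagged paper is a dict with an `id` and a list of tags under `tag`.
    return all(
        isinstance(paper, dict) and "id" in paper and isinstance(paper.get("tag"), list)
        for paper in tag_file["tagged_papers"]
    )


print(check_tag_file(empty), check_tag_file(populated))
```

A check like this could be handy after editing the YAML file by hand, since a missing key or a string where a list is expected is easy to introduce.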

### 2.2 Populating the tag file

You can update the YAML tag file manually or use the helper methods of the `Tags` class. Let's add some new tags to the tag list and view them. We first load the database, since it is passed to the tagging calls further down.

In [4]:
from src.db import Database

In [5]:
tables = ["active_inference", "bayesian_mechanics"]
database = Database()
database.load(tables=tables)

INFO:src.db:Checking tables...
INFO:src.db:Loading tables...
INFO:src.db:Tables downloaded from PubMed on Thursday, Sept. 14, 2023.


In [6]:
# Set the tag category and add new tags to the tag list
tags.set_tag_list_category("Bayesian mechanics")
tags.add_to_tag_list(["information geometry", "Markov blankets"])

# View the current tag list
tags.view_tag_list()

INFO:src.tags:Set Bayesian mechanics as the tag category.
INFO:src.tags:Added ['information geometry', 'Markov blankets'] to the tag list.


Current tags: 
 ['information geometry', 'Markov blankets']


Now we can populate the `tagged_papers` key by associating tags with a paper id.

In [7]:
# Associate a tag with a paper id
tags.associate_tag_with_id(tag_id=31865883, tags=["information geometry"], db=database)
tags.associate_tag_with_id(tag_id=35153603, tags=["Markov blankets"], db=database)
tags.tagged_papers

INFO:src.tags:Added ['information geometry'] to 31865883.
INFO:src.tags:Added ['Markov blankets'] to 35153603.


[{'id': 31865883, 'tag': ['information geometry']},
 {'id': 35153603, 'tag': ['Markov blankets']}]

To add more tags to an id, we run the same call again with the new tags. They are appended to the tags already associated with that id.

In [8]:
# Add additional tags
tags.associate_tag_with_id(tag_id=35153603, tags=["thermodynamics", "sparse coupling"], db=database)
tags.tagged_papers

INFO:src.tags:Added ['thermodynamics', 'sparse coupling'] to 35153603.


[{'id': 31865883, 'tag': ['information geometry']},
 {'id': 35153603,
  'tag': ['Markov blankets', 'thermodynamics', 'sparse coupling']}]
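
The append behaviour seen above can be sketched on a plain list of dicts. This is only an illustration of the idea, assuming that new tags are appended to an existing entry and that duplicates are skipped; `add_tags` is a hypothetical helper, and the real `associate_tag_with_id` may behave differently (it also takes the database, for instance).

```python
def add_tags(tagged_papers, paper_id, new_tags):
    """Append new_tags to the entry for paper_id, creating the entry if needed."""
    for entry in tagged_papers:
        if entry["id"] == paper_id:
            # Existing entry: keep the order and skip tags already present
            # (skipping duplicates is an assumption, not confirmed behaviour).
            for tag in new_tags:
                if tag not in entry["tag"]:
                    entry["tag"].append(tag)
            return tagged_papers
    # No entry for this id yet: start a new one.
    tagged_papers.append({"id": paper_id, "tag": list(new_tags)})
    return tagged_papers


papers = []
add_tags(papers, 31865883, ["information geometry"])
add_tags(papers, 35153603, ["Markov blankets"])
add_tags(papers, 35153603, ["thermodynamics", "sparse coupling"])
print(papers)
```

With the same three calls as in the cells above, the sketch ends up with the same two entries.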

### 2.3 Adding tags interactively

Adding papers one at a time using `associate_tag_with_id` can be tedious. I had to do this for every untagged paper in the database, so I built an interactive mode, mostly for my own convenience, that lets you tag papers more quickly.

In [14]:
tags.add_tags_interactive(db=database, tags=tags)

INFO:src.interactive:Currently 2/499 papers tagged.


[>-----------------------------------------------------------------------------------------------------]



ID: 34968557
Title: Active inference leads to Bayesian neurophysiology
Authors: Isomura T.
Year: 2022
Current tags: no tags added


INFO:src.interactive:Currently 2/499 papers tagged.


You are at the first paper in the database. There are no previous papers.


****************************************************************************************************
[>-----------------------------------------------------------------------------------------------------]



ID: 34968557
Title: Active inference leads to Bayesian neurophysiology
Authors: Isomura T.
Year: 2022
Current tags: no tags added



In [33]:
import pandas as pd
pd.DataFrame(tags.tagged_papers)

| | id | tag |
|---|---|---|
| 0 | 31865883 | [information geometry] |
| 1 | 35153603 | [Markov blankets, thermodynamics, sparse coupl... |




In [9]:
tags.tagged_papers

[{'id': 31865883, 'tag': ['information geometry']},
 {'id': 35153603,
  'tag': ['Markov blankets', 'thermodynamics', 'sparse coupling']}]
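
Because `tagged_papers` is a plain list of dicts, it can be queried without extra tooling. The sketch below finds the ids that carry a given tag and builds a tag-to-ids index; `ids_with_tag` and `index_by_tag` are hypothetical helpers, not methods of the `Tags` class.

```python
from collections import defaultdict

# Same structure as the `tags.tagged_papers` output above.
tagged_papers = [
    {"id": 31865883, "tag": ["information geometry"]},
    {"id": 35153603, "tag": ["Markov blankets", "thermodynamics", "sparse coupling"]},
]


def ids_with_tag(papers, tag):
    """Return the ids of all papers that carry the given tag."""
    return [paper["id"] for paper in papers if tag in paper["tag"]]


def index_by_tag(papers):
    """Invert the list into a dict mapping each tag to the ids that carry it."""
    index = defaultdict(list)
    for paper in papers:
        for tag in paper["tag"]:
            index[tag].append(paper["id"])
    return dict(index)


print(ids_with_tag(tagged_papers, "thermodynamics"))  # [35153603]
print(index_by_tag(tagged_papers)["information geometry"])  # [31865883]
```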

## Loading the tag file into the database

In [None]:
# Load tag file (move to other notebook)

# Filter by tag