# Project 3 - Article Parsing Toolset

## Team 3 Members

- Matthew Dunbar
- Jeffrei Cher
- Basil James

## Project Problem

Parse article content and extract data from it:

- _**Article Classification into a set of pre-defined categories**_
  (a prediction model)

For sports articles:
- _**Identify the sport and extract game/match scores, if present**_
- _(Provide stats summaries, if present?)_
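
As a rough illustration of the score-extraction goal, here is a minimal regex sketch. It is not the project's method: the pattern, the name `SCORE_PATTERN`, and the function `extract_scores` are hypothetical, and the heuristic simply looks for two short numbers joined by a hyphen or en dash.

```python
import re

# Hypothetical heuristic (an assumption, not the project's approach):
# two 1-3 digit numbers separated by a hyphen or en dash, e.g. "3-1" or "102–98".
# The lookarounds reject longer digit/dash runs such as dates ("2020-12-26").
SCORE_PATTERN = re.compile(r"(?<![\d\-–])(\d{1,3})\s?[-–]\s?(\d{1,3})(?![\d\-–])")


def extract_scores(text: str) -> list[tuple[int, int]]:
    """Return every (first, second) number pair that looks like a score."""
    return [(int(a), int(b)) for a, b in SCORE_PATTERN.findall(text)]


example = "Arsenal beat Chelsea 3-1 on Sunday, and the Lakers won 102–98."
scores = extract_scores(example)
print(scores)
```

A pattern like this will also match numeric ranges (for example "10–12 percent"), so it would only make sense to apply it after an article has been classified as sports, and a trained model may well replace it.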

## Learning Goal

Develop experience in:

- Building and deploying models for use in real-world applications
- Custom training of LLMs to provide article analysis
- Using a tool-based (extensible toolbox) approach to provide multiple analytics features


## Dataset

- https://www.kaggle.com/datasets/fabiochiusano/medium-articles  
  (190k+ articles, multiple tags per article; includes titles, full article text, and URLs)

### Retrieve dataset

In [1]:
! kaggle datasets download -d fabiochiusano/medium-articles -p ./data --unzip

Dataset URL: https://www.kaggle.com/datasets/fabiochiusano/medium-articles
License(s): CC0-1.0
Downloading medium-articles.zip to ./data
100%|█████████████████████████████████████████| 369M/369M [00:03<00:00, 144MB/s]
100%|█████████████████████████████████████████| 369M/369M [00:03<00:00, 119MB/s]


### Local file

In [7]:
!ls -l ./data/

total 1017916
-rw-r--r-- 1 jupyter jupyter 1042340506 Apr  2 17:18 medium_articles.csv


### Import basic DataFrame dependencies

In [72]:
import os

import pandas as pd
from google.cloud import bigquery

import subprocess
import warnings

os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
warnings.filterwarnings("ignore")

import json
from pprint import pprint
import markdown
import tensorflow as tf

print(tf.version.VERSION)

2.12.0


### Load dataframe

In [117]:
# Load dataset (change filename accordingly)
df = pd.read_csv('./data/medium_articles.csv')

# Display first few rows
df.head()

Unnamed: 0,title,text,url,authors,timestamp,tags
0,Mental Note Vol. 24,"Photo by Josh Riemer on Unsplash\n\nMerry Christmas and Happy Holidays, everyone!\n\nWe just wanted everyone to know how much we appreciate everyone and how thankful we are for all our readers and writers here. We wouldn’t be anywhere without you, so thank you all for bringing informative, vulnerable, and important pieces that destigmatize mental illness and mental health.\n\nWithout further ado, here are ten of our top stories from last week, all of which were curated:\n\n“Just as the capacity to love and inspire is universal so is the capacity to hate and discourage. Irrespective of gender, race, age or religion none of us are exempt from aggressive proclivities. Those who are narcissistically disordered, and accordingly repress deep seated feelings of inferiority with inflated delusions of grandeur and superiority, are more prone to aggression and violence. They infiltrate our interactions in myriad environments from home, work, school and the cyber world. Hence, bullying does not happen in isolation. Although there is a ringleader she looks to her minions to either sanction her cruelty or look the other way.”\n\n“Even though the circumstances that brought me here were sad and challenging, I’m grateful for how this program has changed my life for the better. I can’t help but imagine what life would be like if everyone learned to accept their powerlessness over other people, prioritize their serenity, and take life one step at a time. We’ll never know, but I’d bet the world would be much happier.”\n\n“The prospect of spending a horrible Christmas, locked in on a psychiatric unit, was one of the low points of my life. For weeks, the day room was festooned with cheesy decorations and a sorry pink aluminum tree. All of our “activity” therapies revolved around the holidays. We baked and decorated cookies. We fashioned quick-drying clay into ornaments that turned out to be too heavy for the tree. Crappy Christmas carols were background torture. 
It was hard to get pissed off at the staff because they were making the best with what they had.”\n\n“Although I hate to admit it, even if my ex had never betrayed me, I still wouldn’t have been happy. I had set him up for an impossible job — to define me and make me whole. If I cannot find peace and contentment within myself, how could anyone else do it for me?”\n\n“On a personal note, significant feelings of loss and sadness can still flare up from time to time. That’s only natural; it’s no reason for self-critique. No matter how resilient we purport to be, we are all emotionally vulnerable human beings. Besides, we aren’t talking about some conceptual loss that we can just mechanically compartmentalize away — we are talking about the loss of our fathers, mothers, sisters and brothers.”\n\n“The next six weeks will be hard as cases continue to explode and government leadership remains nonexistent. I can’t control any of this. The only thing I can do is take deep breaths, remain vigilant when it comes to limiting exposure to the virus, and let lots of stuff go. I may always be a hypochondriac, but now that I recognize the beast, I’m hopeful I’ll be able to tame it.”\n\n“From anecdotal news reports and informal surveys, there is evidence that for some of us, this pandemic-imposed isolation is a boon rather than a trial. One study on mixed emotions showed that those with lower emotional stability (“moody” personalities) are actually better at responding to uncertainty.”\n\n“Every day I wish in my heart and soul that I didn’t have ME/CFS. Unfortunately, I do. It’s a result of a virus I had; 10–12 percent of people who experience a serious infection go on to develop ME. I’ve visualized life without CFS for over a year now; I can smell life without it, I can taste it. It’s in the smell of the lavender fields that I can no longer run through. It’s in the taste of the meals from my favorite restaurant that I can no longer walk to. It’s on the tip of my tongue. 
It’s in the potentialities; all the things I could be doing, as a twenty-four year-old, that I can’t. I cannot cross the chasm between the potential and the reality. And that’s nothing to do with manifestation.”\n\n“Whether it’s cabin fever, redundancy, loss, or general Covid anxieties, this year has caused us to be exposed to more uncertainty than ever. Uncertainty creates unease and feelings of stress. Some of us may have taken this year as one to motivate — plan dream trips, and prepare and be inspired for what the future could bring. For the rest, it has caused us to become irrational, emotional, and reserved.\n\n“To be more self-compassionate is a task that can be tricky because we always want to push ourselves and do better. Without realising it, this can lead to us being self-critical which can have damaging consequences.\n\nIt’s important to notice these times when we are harsh because we can easily turn it into self-compassion, which is linked to a better quality of life.”\n\nMerry Christmas and Happy Holidays, everyone!\n\n— Ryan, Juliette, Marie, and Meredith",https://medium.com/invisible-illness/mental-note-vol-24-969b6a42443f,['Ryan Fan'],2020-12-26 03:38:10.479000+00:00,"['Mental Health', 'Health', 'Psychology', 'Science', 'Neuroscience']"
1,Your Brain On Coronavirus,"Your Brain On Coronavirus\n\nA guide to the curious and troubling impact of the pandemic and isolation\n\nPhoto by cottonbro from Pexels\n\nThe coronavirus pandemic frustrates and confounds epidemiologists and immunologists, even after months of study. It frustrates politicians and public health officials dealing with mask non-compliance. It frustrates everyone stuck at home, whether they lost their job or adapting to Zoom.\n\nAfter exposure to the virus, it first enters the lungs, using host machinery to replicate. The virus itself is just a genetic sequence enclosed in a protein and lipid coat. It binds the ACE2 receptors on lung cells, with a spike protein located on its protein-lipid coat. This receptor, attached to the virus, trafficks into the lung cell. Here the virus hijacks the machinery of the cell to replicate, damaging lung tissue and spreading throughout the body.\n\nThe ACE2 receptor, expressed in many regions of the body, is vulnerable to further entry of these viral particles. The ACE2 receptor regulates blood pressure, nutrient absorption and inflammation. These pathways converge and mediate brain health and disease.\n\nThe novel coronavirus perplexed us for many different reasons. A large majority of people who get it don’t display any symptoms, while some display symptoms for many months and others require ventilators to breathe. It is unclear whether someone infected with coronavirus retains long-term immunity.\n\nAlso, troubling findings implicate this disease in the induction of stroke and the worsening of mental health. The realization that there are likely long-term complications of coronavirus infection is worrying, as millions of people may require expensive coverage for this new pre-existing condition.\n\nThose of us lucky to avoid being infected become more socially isolated and lonely. Many studies report the worsening of mental health symptoms, especially in frontline workers, nurses and doctors. 
These professionals are more prone to burning out and require extra care.\n\nCOVID-19 and Stroke\n\nThe cells in the brain require a disproportionate amount of energy to function. When deprived of oxygen, even for minutes, the cells begin to die, leading to a variety of debilitating sensory, motor or language deficits depending. When there is blood loss to a specific region of the brain, cells cannot use oxygen to generate energy. If there is a clot in an artery, fresh oxygen cannot travel to any regions primarily supplied by that blood vessel. These events, classified as ischemic strokes, cause lifelong disability in some of those afflicted.\n\nEarly findings in patients found abnormal clotting in blood vessels. Vessels around the lungs or even arterial blood-flow to the brain is interrupted. Thus, individuals infected with coronavirus who suffered abnormal blood clotting as a result, were at higher risk of stroke.\n\nIn June of 2020, researchers published a report of neurological symptoms in the New England Journal of Medicine. While they did not report common symptoms of having a stroke, they showed other strange brain-related features. Of thirteen COVID-19 patients who underwent brain imaging, three of them showed signs of an ischemic stroke. A subset of eight of these patients showed other types of inflammation, while eleven presented with a lack of blood flow to the frontal areas of the brain.\n\nThough a preliminary observational study, it suggested that the coronavirus impacted blood clotting and flow to the brain. Several studies since identified swathes of patients suffering from ischemic strokes or brain/vascular inflammation. Another study reviewed the current state of evidence, concluding that 41% of patients suffering from neurological symptoms after COVID-19 infection, suffered from strokes. 
Larger studies however, are needed to decipher how common this is among all those infected with the novel coronavirus.\n\nDepending on which region of the brain loses oxygen, stroke may manifest as a broad range of symptoms. If cells die in an area of the brain responsible for motor movement, it later manifests in unilateral or bilateral difficulties with movement. Other common symptoms involve fatigue, challenges with balance or walking, partial paralysis, pain or inattention to one entire side of the body. It prevents individuals from doing the things they do in their daily lives, such as dress themselves or go to the bathroom independently.\n\nCOVID-19 and Psychiatric Disorders\n\nPhoto by Jonathan Rados on Unsplash\n\nEither through neuroimmune signalling or by directly entering the cells of the brain, COVID-19 also contributes to psychiatric symptoms and disorders. It is unclear what role it may play in their pathology, but it may worsen existing conditions or as a contributing factor in its development.\n\nOne study compared individuals afflicted with the novel coronavirus to those in quarantine or the general public, finding elevated rates of depression (29.2%) in those with COVID-19. Another small study reported increased post-traumatic stress symptoms in these patients.\n\nIndividuals already living with psychiatric disorders reported a worsening of symptoms in two different studies. Several other studies reported depressive and anxious symptoms worsened among essential workers.\n\nAnother study surveyed >2000 individuals in Denmark, finding a reduction in overall psychological well-being measures during the pandemics. This study also reported that women were more negatively affected than men.\n\nAdditionally, it recognized that many older adults living in adult-care communities during shelter-in-place orders experience loneliness and depression. 
A study of older adults in San Francisco found that they showed increased rates of loneliness and depression.\n\nWe must do our best to check-up on our friends and loved ones. We are all affected differently by the pandemic, so it is important to recognize that the rates of anxiety, depression and stress-related disorders may arise.\n\nCOVID-19 Long-Haulers\n\nThousands of individuals initially infected with COVID-19, the long-haulers, continue to suffer symptoms many months later. On average, these individuals are women around the age of 44 who are otherwise healthy. Their infections were classified as mild severity because they could recover at home.\n\nFacing stigma and in need of a community, several groups sprouted up to support each other. Originally disbelieved, they rallied to raise awareness of their predicament within the medical establishment. It should no longer be sufficient to classify individuals infected with COVID-19, who don’t require a hospital stay, as mild.\n\nA few different studies report that most individuals affected with COVID-19 suffer from symptoms months later (Italy, UK, Germany). Intriguingly, many long-haulers did not produce high-levels of coronavirus antibodies. Many individuals experience pain, fatigue and many other debilitating symptoms.\n\nThese symptoms are consistent with disturbances in the autonomic nervous system, which is responsible for many automatic physiological functions like breathing or heart-rate but also influence fatigue. Preliminary physiotherapy involves reconditioning the nervous system of patients so that they may regain some of these functions. In his article, Ed Yong states:",https://medium.com/age-of-awareness/how-the-pandemic-affects-our-brain-and-mental-health-ae2ec0a9fc1d,['Simon Spichak'],2020-09-23 22:10:17.126000+00:00,"['Mental Health', 'Coronavirus', 'Science', 'Psychology', 'Neuroscience']"
2,Mind Your Nose,"Mind Your Nose\n\nHow smell training can change your brain in six weeks — and why it matters.\n\nBy Ann-Sophie Barwich\n\nWhen it comes to training your brain, your sense of smell is possibly the last thing you’d think could strengthen your neural pathways. Learning a new language or reading more books (and fewer social media posts) — sure. But your nose?\n\nThat’s because the olfactory system is one of the most plastic systems in your brain. Neuroplasticity describes how the brain flexibly adapts to changes in the environment or when exposed to neural damage. Stimulating the brain strengthens existing neural structures and further adds fuel to the brain’s capacity to remain adaptive, thereby keeping it young. And your smell system is particularly adept at repair and renewal. (Olfactory cells have recently been used in human transplant therapy to treat spinal cord injury, for example.)\n\nOne reason for the olfactory system’s adaptive responsiveness is that it undergoes adult neurogenesis. Humans grow new olfactory neurons every three to four weeks throughout their entire life, not just during child development. (These sensory neurons sit in the mucous of your nose, where they pick up airborne chemicals and send activity signals straight to the core of the brain.) If it weren’t for this ongoing regeneration of sensory cells in your nose, we would stop detecting smells after our first few colds.\n\nNeural plasticity weakens as we grow old — and so does our sense of smell. Olfactory performance decreases around the age of 70 as the regeneration of olfactory neurons slows down. Yet this process of regeneration never stops entirely. Training your nose helps slow down that decline and offers a great way to increase your brain’s plasticity. That said, increasing your sensitivity to odors in the environment does not always sound desirable. 
Smell usually comes with negative connotations: that whiff of urine in the metro, that overpowering literal skunk, or that trail of body odor from the person walking in front of you. But paying more attention to the smells around you also has benefits, and not just for a greater enjoyment of food aromas and neighbors’ gardens.\n\nRecent studies show that olfactory abilities correspond with differences in cortical areas involved in smell processing in the brain. Johannes Frasnelli, an olfactory scientist at the University of Quebec in Trois-Rivières, explained: “We did some studies where we saw that there is a link between the structure of certain brain regions-like the thickness of the cortex and the thickness of the gray matter layer in certain brain olfactory processing regions-and the ability to perceive.” Frasnelli and his colleagues found that people with better perceptual capacities had a thicker cortex. When they looked at people who had lost their sense of smell, they also saw a reduction of cortical matter in areas involved in odor processing.\n\nThat raises the question: Could you change the structure of your brain simply by smelling things? In 2019, Frasnelli’s group discovered that undergoing as little as six weeks of intense olfactory training results in significant structural changes in some regions of the brain (namely, the right inferior frontal gyrus, the bilateral fusiform gyrus, and the right entorhinal cortex).\n\nParticipants were given three tasks with a cognitive component.\n\nThe first task was a classification task. Participants had to organize two simple odor mixtures by ordering each from lowest to highest concentration. The second was an identification task. Participants were presented with a target odor blended with a citrus scent in a specific ratio (4%). Then they were given the same blend in different ratios and asked to order them according to quality (more citrusy or less?). 
Lastly, the detection task: Was the learned target odor present in a range of 14 samples of different odor mixtures or not?\n\nThis entire exercise was undertaken each day for 20 minutes during the six weeks. Responses were monitored and evaluated on speed and accuracy.\n\nSuch intense olfactory training led to a general improvement in olfactory performance. Plus, the increase of olfactory skill was not restricted to the training exercises but also transferred to other olfactory abilities-abilities that had not been tested as part of the training. These perceptual tests included: the detection threshold of an odor, accuracy in odor discrimination (same or different?), cued odor identification (which of these four descriptors is correct?), and even free odor identification (identifying an odor without cues!).\n\nIncreasing insight into what the nose knows, and how it communicates with the brain, has broader implications-even philosophical ones. Old (yet still prevalent) cookie-cutter views of the mind coax us to believe that our senses are passive-indifferently picking up signals in the world that are then processed by the brain. Perception, in such views, is a process separate from cognition. Highly plastic systems such as olfaction present us with a much more intriguing and interwoven picture of the mind: Training your nose’s performance (just like other cognitive capacities) fundamentally shapes what you perceive by rewiring the system.\n\nYour senses are far from being impartial transmitters; what you are able to perceive in the world ultimately hinges on the depth of your cognitive engagement with it. In other words, your mind does not emerge apathetically as a product of some remarkable, intricate molecular twists performed by the brain. The mind is enhanced by what you can train your brain to do. 
Just like strength is a result of muscle training, cognitive training of the senses is the bodybuilding of the brain.",https://medium.com/neodotlife/mind-your-nose-f0b097d533bb,[],2020-10-10 20:17:37.132000+00:00,"['Biotechnology', 'Neuroscience', 'Brain', 'Wellness', 'Science']"
3,The 4 Purposes of Dreams,Passionate about the synergy between science and technology to provide better care. Check out my newsletter: scienceforreal.substack.com 📰\n\nFollow,https://medium.com/science-for-real/the-4-purposes-of-dreams-fc6719090e75,['Eshan Samaranayake'],2020-12-21 16:05:19.524000+00:00,"['Health', 'Neuroscience', 'Mental Health', 'Psychology', 'Science']"
4,Surviving a Rod Through the Head,"You’ve heard of him, haven’t you? Phineas Gage. The railroad worker who survived an explosion that involved an iron rod piercing through his left cheek and out of his brain and skull.\n\nYeah.\n\nI know.\n\nYou’re probably wondering “yeah, alright sweet. What about him?” Well, let’s just say that he was a really popular patient for the field of neuroscience (Cherry, par. 1). And what I found the most interesting about this tragic event was the science of his behavior afterward.\n\nFor those of you who don’t know much about Phineas Gage, let me fill you in with the help of my research.\n\nPhineas Gage, 25 years old, was a railroad worker in Vermont. One day, at work, he was using an iron rod to handle explosive gun powder. As he was using the iron rod to handle the gun powder, an explosion suddenly occurred. The iron rod then went through his left cheek and brain. Fortunately, he survived and was able to talk and walk after the accident (Cherry, par. 2–3).\n\nWhy did people say that Phineas Gage was a “different person” after his accident? It actually has to do with neuroscience.\n\nThe iron rod went through his brain, in particular, it went through the frontal lobe of his brain. Does this mean that the frontal lobe of your brain has to do with the kind of person you are? To answer this question, we have to understand what the frontal lobe in our brain is responsible for.\n\nOur frontal lobes are responsible for many things. Some of them are higher-order thinking, personality, and decision making. This explains why people who knew Phineas Gage said that he was a totally different person after the accident. Since the iron rod went through his frontal lobe, it means that his personality and thinking, as a whole, completely changed, making him seem like he was a whole different person due to the way he started acting.\n\nThis accident and the treatment of Phineas Gage actually played a big role in the field of neurology. 
His case helped scientists better understand the role of the frontal cortex of the brain (Cherry, par. 16–17).\n\nBibliography\n\nCherry, Kendra. “The Famous Case of Phineas Gage’s Astonishing Brain Injury.” Phineas Gage’s Astonishing Brain Injury, Verywell Mind, 3 Oct. 2019, www.verywellmind.com/phineas-gage-2795244#targetText=The%20rod%20penetrated%20Gage's%20left,be%20seen%20by%20a%20doctor.",https://medium.com/live-your-life-on-purpose/surviving-a-rod-through-the-head-2e5d74db978,['Rishav Sinha'],2020-02-26 00:01:01.576000+00:00,"['Brain', 'Health', 'Development', 'Psychology', 'Science']"
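
The `tags` values in the output above look like stringified Python lists rather than real lists. Assuming that is how they load from the CSV, a minimal sketch for turning them into lists and counting tag frequencies could look like this. The helper names `parse_tags` and `count_tags` are hypothetical, and the sample strings are copied from the first two rows shown above.

```python
import ast
from collections import Counter


def parse_tags(raw: str) -> list[str]:
    """Convert a stringified list such as "['Health', 'Science']" into a list of tags."""
    try:
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Malformed or missing values yield an empty tag list instead of an error.
        return []
    return [str(tag) for tag in value] if isinstance(value, list) else []


def count_tags(raw_values) -> Counter:
    """Count how often each tag occurs across an iterable of raw tag strings."""
    counts = Counter()
    for raw in raw_values:
        counts.update(parse_tags(raw))
    return counts


# Tag strings taken from the first two rows of the output above.
sample = [
    "['Mental Health', 'Health', 'Psychology', 'Science', 'Neuroscience']",
    "['Mental Health', 'Coronavirus', 'Science', 'Psychology', 'Neuroscience']",
]
tag_counts = count_tags(sample)
print(tag_counts.most_common(3))
```

On the full DataFrame, the same helper could be applied with `df["tags"].apply(parse_tags)`, and the resulting counts may help when choosing the set of pre-defined categories for the classifier.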


### Parse DataFrame to remove unwanted columns

In [66]:
# Define the CSV columns and choose which ones to keep
CSV_COLUMNS = [
    "title",
    "text",
    "url",
    "authors",
    "timestamp",
    "tags",
]
LABEL_COLUMN = "text"
DEFAULTS = [["na"], ["na"], ["na"], ["na"], ["na"], ["na"]]
UNWANTED_COLS = ["title", "url", "authors", "timestamp"]

# Keep only the columns not listed as unwanted ("text" and "tags")
DESIRED_COLUMNS = [col for col in CSV_COLUMNS if col not in UNWANTED_COLS]
df = df[DESIRED_COLUMNS]

# Show full column width and prevent truncation
pd.set_option("display.max_colwidth", None)  # Show full text
pd.set_option("display.expand_frame_repr", False)  # Prevent wrapping

df.head()

Unnamed: 0,text,tags
0,"Photo by Josh Riemer on Unsplash\n\nMerry Christmas and Happy Holidays, everyone!\n\nWe just wanted everyone to know how much we appreciate everyone and how thankful we are for all our readers and writers here. We wouldn’t be anywhere without you, so thank you all for bringing informative, vulnerable, and important pieces that destigmatize mental illness and mental health.\n\nWithout further ado, here are ten of our top stories from last week, all of which were curated:\n\n“Just as the capacity to love and inspire is universal so is the capacity to hate and discourage. Irrespective of gender, race, age or religion none of us are exempt from aggressive proclivities. Those who are narcissistically disordered, and accordingly repress deep seated feelings of inferiority with inflated delusions of grandeur and superiority, are more prone to aggression and violence. They infiltrate our interactions in myriad environments from home, work, school and the cyber world. Hence, bullying does not happen in isolation. Although there is a ringleader she looks to her minions to either sanction her cruelty or look the other way.”\n\n“Even though the circumstances that brought me here were sad and challenging, I’m grateful for how this program has changed my life for the better. I can’t help but imagine what life would be like if everyone learned to accept their powerlessness over other people, prioritize their serenity, and take life one step at a time. We’ll never know, but I’d bet the world would be much happier.”\n\n“The prospect of spending a horrible Christmas, locked in on a psychiatric unit, was one of the low points of my life. For weeks, the day room was festooned with cheesy decorations and a sorry pink aluminum tree. All of our “activity” therapies revolved around the holidays. We baked and decorated cookies. We fashioned quick-drying clay into ornaments that turned out to be too heavy for the tree. Crappy Christmas carols were background torture. 
It was hard to get pissed off at the staff because they were making the best with what they had.”\n\n“Although I hate to admit it, even if my ex had never betrayed me, I still wouldn’t have been happy. I had set him up for an impossible job — to define me and make me whole. If I cannot find peace and contentment within myself, how could anyone else do it for me?”\n\n“On a personal note, significant feelings of loss and sadness can still flare up from time to time. That’s only natural; it’s no reason for self-critique. No matter how resilient we purport to be, we are all emotionally vulnerable human beings. Besides, we aren’t talking about some conceptual loss that we can just mechanically compartmentalize away — we are talking about the loss of our fathers, mothers, sisters and brothers.”\n\n“The next six weeks will be hard as cases continue to explode and government leadership remains nonexistent. I can’t control any of this. The only thing I can do is take deep breaths, remain vigilant when it comes to limiting exposure to the virus, and let lots of stuff go. I may always be a hypochondriac, but now that I recognize the beast, I’m hopeful I’ll be able to tame it.”\n\n“From anecdotal news reports and informal surveys, there is evidence that for some of us, this pandemic-imposed isolation is a boon rather than a trial. One study on mixed emotions showed that those with lower emotional stability (“moody” personalities) are actually better at responding to uncertainty.”\n\n“Every day I wish in my heart and soul that I didn’t have ME/CFS. Unfortunately, I do. It’s a result of a virus I had; 10–12 percent of people who experience a serious infection go on to develop ME. I’ve visualized life without CFS for over a year now; I can smell life without it, I can taste it. It’s in the smell of the lavender fields that I can no longer run through. It’s in the taste of the meals from my favorite restaurant that I can no longer walk to. It’s on the tip of my tongue. 
It’s in the potentialities; all the things I could be doing, as a twenty-four year-old, that I can’t. I cannot cross the chasm between the potential and the reality. And that’s nothing to do with manifestation.”\n\n“Whether it’s cabin fever, redundancy, loss, or general Covid anxieties, this year has caused us to be exposed to more uncertainty than ever. Uncertainty creates unease and feelings of stress. Some of us may have taken this year as one to motivate — plan dream trips, and prepare and be inspired for what the future could bring. For the rest, it has caused us to become irrational, emotional, and reserved.\n\n“To be more self-compassionate is a task that can be tricky because we always want to push ourselves and do better. Without realising it, this can lead to us being self-critical which can have damaging consequences.\n\nIt’s important to notice these times when we are harsh because we can easily turn it into self-compassion, which is linked to a better quality of life.”\n\nMerry Christmas and Happy Holidays, everyone!\n\n— Ryan, Juliette, Marie, and Meredith","['Mental Health', 'Health', 'Psychology', 'Science', 'Neuroscience']"
1,"Your Brain On Coronavirus\n\nA guide to the curious and troubling impact of the pandemic and isolation\n\nPhoto by cottonbro from Pexels\n\nThe coronavirus pandemic frustrates and confounds epidemiologists and immunologists, even after months of study. It frustrates politicians and public health officials dealing with mask non-compliance. It frustrates everyone stuck at home, whether they lost their job or adapting to Zoom.\n\nAfter exposure to the virus, it first enters the lungs, using host machinery to replicate. The virus itself is just a genetic sequence enclosed in a protein and lipid coat. It binds the ACE2 receptors on lung cells, with a spike protein located on its protein-lipid coat. This receptor, attached to the virus, trafficks into the lung cell. Here the virus hijacks the machinery of the cell to replicate, damaging lung tissue and spreading throughout the body.\n\nThe ACE2 receptor, expressed in many regions of the body, is vulnerable to further entry of these viral particles. The ACE2 receptor regulates blood pressure, nutrient absorption and inflammation. These pathways converge and mediate brain health and disease.\n\nThe novel coronavirus perplexed us for many different reasons. A large majority of people who get it don’t display any symptoms, while some display symptoms for many months and others require ventilators to breathe. It is unclear whether someone infected with coronavirus retains long-term immunity.\n\nAlso, troubling findings implicate this disease in the induction of stroke and the worsening of mental health. The realization that there are likely long-term complications of coronavirus infection is worrying, as millions of people may require expensive coverage for this new pre-existing condition.\n\nThose of us lucky to avoid being infected become more socially isolated and lonely. Many studies report the worsening of mental health symptoms, especially in frontline workers, nurses and doctors. 
These professionals are more prone to burning out and require extra care.\n\nCOVID-19 and Stroke\n\nThe cells in the brain require a disproportionate amount of energy to function. When deprived of oxygen, even for minutes, the cells begin to die, leading to a variety of debilitating sensory, motor or language deficits depending. When there is blood loss to a specific region of the brain, cells cannot use oxygen to generate energy. If there is a clot in an artery, fresh oxygen cannot travel to any regions primarily supplied by that blood vessel. These events, classified as ischemic strokes, cause lifelong disability in some of those afflicted.\n\nEarly findings in patients found abnormal clotting in blood vessels. Vessels around the lungs or even arterial blood-flow to the brain is interrupted. Thus, individuals infected with coronavirus who suffered abnormal blood clotting as a result, were at higher risk of stroke.\n\nIn June of 2020, researchers published a report of neurological symptoms in the New England Journal of Medicine. While they did not report common symptoms of having a stroke, they showed other strange brain-related features. Of thirteen COVID-19 patients who underwent brain imaging, three of them showed signs of an ischemic stroke. A subset of eight of these patients showed other types of inflammation, while eleven presented with a lack of blood flow to the frontal areas of the brain.\n\nThough a preliminary observational study, it suggested that the coronavirus impacted blood clotting and flow to the brain. Several studies since identified swathes of patients suffering from ischemic strokes or brain/vascular inflammation. Another study reviewed the current state of evidence, concluding that 41% of patients suffering from neurological symptoms after COVID-19 infection, suffered from strokes. 
Larger studies however, are needed to decipher how common this is among all those infected with the novel coronavirus.\n\nDepending on which region of the brain loses oxygen, stroke may manifest as a broad range of symptoms. If cells die in an area of the brain responsible for motor movement, it later manifests in unilateral or bilateral difficulties with movement. Other common symptoms involve fatigue, challenges with balance or walking, partial paralysis, pain or inattention to one entire side of the body. It prevents individuals from doing the things they do in their daily lives, such as dress themselves or go to the bathroom independently.\n\nCOVID-19 and Psychiatric Disorders\n\nPhoto by Jonathan Rados on Unsplash\n\nEither through neuroimmune signalling or by directly entering the cells of the brain, COVID-19 also contributes to psychiatric symptoms and disorders. It is unclear what role it may play in their pathology, but it may worsen existing conditions or as a contributing factor in its development.\n\nOne study compared individuals afflicted with the novel coronavirus to those in quarantine or the general public, finding elevated rates of depression (29.2%) in those with COVID-19. Another small study reported increased post-traumatic stress symptoms in these patients.\n\nIndividuals already living with psychiatric disorders reported a worsening of symptoms in two different studies. Several other studies reported depressive and anxious symptoms worsened among essential workers.\n\nAnother study surveyed >2000 individuals in Denmark, finding a reduction in overall psychological well-being measures during the pandemics. This study also reported that women were more negatively affected than men.\n\nAdditionally, it recognized that many older adults living in adult-care communities during shelter-in-place orders experience loneliness and depression. 
A study of older adults in San Francisco found that they showed increased rates of loneliness and depression.\n\nWe must do our best to check-up on our friends and loved ones. We are all affected differently by the pandemic, so it is important to recognize that the rates of anxiety, depression and stress-related disorders may arise.\n\nCOVID-19 Long-Haulers\n\nThousands of individuals initially infected with COVID-19, the long-haulers, continue to suffer symptoms many months later. On average, these individuals are women around the age of 44 who are otherwise healthy. Their infections were classified as mild severity because they could recover at home.\n\nFacing stigma and in need of a community, several groups sprouted up to support each other. Originally disbelieved, they rallied to raise awareness of their predicament within the medical establishment. It should no longer be sufficient to classify individuals infected with COVID-19, who don’t require a hospital stay, as mild.\n\nA few different studies report that most individuals affected with COVID-19 suffer from symptoms months later (Italy, UK, Germany). Intriguingly, many long-haulers did not produce high-levels of coronavirus antibodies. Many individuals experience pain, fatigue and many other debilitating symptoms.\n\nThese symptoms are consistent with disturbances in the autonomic nervous system, which is responsible for many automatic physiological functions like breathing or heart-rate but also influence fatigue. Preliminary physiotherapy involves reconditioning the nervous system of patients so that they may regain some of these functions. In his article, Ed Yong states:","['Mental Health', 'Coronavirus', 'Science', 'Psychology', 'Neuroscience']"
2,"Mind Your Nose\n\nHow smell training can change your brain in six weeks — and why it matters.\n\nBy Ann-Sophie Barwich\n\nWhen it comes to training your brain, your sense of smell is possibly the last thing you’d think could strengthen your neural pathways. Learning a new language or reading more books (and fewer social media posts) — sure. But your nose?\n\nThat’s because the olfactory system is one of the most plastic systems in your brain. Neuroplasticity describes how the brain flexibly adapts to changes in the environment or when exposed to neural damage. Stimulating the brain strengthens existing neural structures and further adds fuel to the brain’s capacity to remain adaptive, thereby keeping it young. And your smell system is particularly adept at repair and renewal. (Olfactory cells have recently been used in human transplant therapy to treat spinal cord injury, for example.)\n\nOne reason for the olfactory system’s adaptive responsiveness is that it undergoes adult neurogenesis. Humans grow new olfactory neurons every three to four weeks throughout their entire life, not just during child development. (These sensory neurons sit in the mucous of your nose, where they pick up airborne chemicals and send activity signals straight to the core of the brain.) If it weren’t for this ongoing regeneration of sensory cells in your nose, we would stop detecting smells after our first few colds.\n\nNeural plasticity weakens as we grow old — and so does our sense of smell. Olfactory performance decreases around the age of 70 as the regeneration of olfactory neurons slows down. Yet this process of regeneration never stops entirely. Training your nose helps slow down that decline and offers a great way to increase your brain’s plasticity. That said, increasing your sensitivity to odors in the environment does not always sound desirable. 
Smell usually comes with negative connotations: that whiff of urine in the metro, that overpowering literal skunk, or that trail of body odor from the person walking in front of you. But paying more attention to the smells around you also has benefits, and not just for a greater enjoyment of food aromas and neighbors’ gardens.\n\nRecent studies show that olfactory abilities correspond with differences in cortical areas involved in smell processing in the brain. Johannes Frasnelli, an olfactory scientist at the University of Quebec in Trois-Rivières, explained: “We did some studies where we saw that there is a link between the structure of certain brain regions-like the thickness of the cortex and the thickness of the gray matter layer in certain brain olfactory processing regions-and the ability to perceive.” Frasnelli and his colleagues found that people with better perceptual capacities had a thicker cortex. When they looked at people who had lost their sense of smell, they also saw a reduction of cortical matter in areas involved in odor processing.\n\nThat raises the question: Could you change the structure of your brain simply by smelling things? In 2019, Frasnelli’s group discovered that undergoing as little as six weeks of intense olfactory training results in significant structural changes in some regions of the brain (namely, the right inferior frontal gyrus, the bilateral fusiform gyrus, and the right entorhinal cortex).\n\nParticipants were given three tasks with a cognitive component.\n\nThe first task was a classification task. Participants had to organize two simple odor mixtures by ordering each from lowest to highest concentration. The second was an identification task. Participants were presented with a target odor blended with a citrus scent in a specific ratio (4%). Then they were given the same blend in different ratios and asked to order them according to quality (more citrusy or less?). 
Lastly, the detection task: Was the learned target odor present in a range of 14 samples of different odor mixtures or not?\n\nThis entire exercise was undertaken each day for 20 minutes during the six weeks. Responses were monitored and evaluated on speed and accuracy.\n\nSuch intense olfactory training led to a general improvement in olfactory performance. Plus, the increase of olfactory skill was not restricted to the training exercises but also transferred to other olfactory abilities-abilities that had not been tested as part of the training. These perceptual tests included: the detection threshold of an odor, accuracy in odor discrimination (same or different?), cued odor identification (which of these four descriptors is correct?), and even free odor identification (identifying an odor without cues!).\n\nIncreasing insight into what the nose knows, and how it communicates with the brain, has broader implications-even philosophical ones. Old (yet still prevalent) cookie-cutter views of the mind coax us to believe that our senses are passive-indifferently picking up signals in the world that are then processed by the brain. Perception, in such views, is a process separate from cognition. Highly plastic systems such as olfaction present us with a much more intriguing and interwoven picture of the mind: Training your nose’s performance (just like other cognitive capacities) fundamentally shapes what you perceive by rewiring the system.\n\nYour senses are far from being impartial transmitters; what you are able to perceive in the world ultimately hinges on the depth of your cognitive engagement with it. In other words, your mind does not emerge apathetically as a product of some remarkable, intricate molecular twists performed by the brain. The mind is enhanced by what you can train your brain to do. 
Just like strength is a result of muscle training, cognitive training of the senses is the bodybuilding of the brain.","['Biotechnology', 'Neuroscience', 'Brain', 'Wellness', 'Science']"
3,Passionate about the synergy between science and technology to provide better care. Check out my newsletter: scienceforreal.substack.com 📰\n\nFollow,"['Health', 'Neuroscience', 'Mental Health', 'Psychology', 'Science']"
4,"You’ve heard of him, haven’t you? Phineas Gage. The railroad worker who survived an explosion that involved an iron rod piercing through his left cheek and out of his brain and skull.\n\nYeah.\n\nI know.\n\nYou’re probably wondering “yeah, alright sweet. What about him?” Well, let’s just say that he was a really popular patient for the field of neuroscience (Cherry, par. 1). And what I found the most interesting about this tragic event was the science of his behavior afterward.\n\nFor those of you who don’t know much about Phineas Gage, let me fill you in with the help of my research.\n\nPhineas Gage, 25 years old, was a railroad worker in Vermont. One day, at work, he was using an iron rod to handle explosive gun powder. As he was using the iron rod to handle the gun powder, an explosion suddenly occurred. The iron rod then went through his left cheek and brain. Fortunately, he survived and was able to talk and walk after the accident (Cherry, par. 2–3).\n\nWhy did people say that Phineas Gage was a “different person” after his accident? It actually has to do with neuroscience.\n\nThe iron rod went through his brain, in particular, it went through the frontal lobe of his brain. Does this mean that the frontal lobe of your brain has to do with the kind of person you are? To answer this question, we have to understand what the frontal lobe in our brain is responsible for.\n\nOur frontal lobes are responsible for many things. Some of them are higher-order thinking, personality, and decision making. This explains why people who knew Phineas Gage said that he was a totally different person after the accident. Since the iron rod went through his frontal lobe, it means that his personality and thinking, as a whole, completely changed, making him seem like he was a whole different person due to the way he started acting.\n\nThis accident and the treatment of Phineas Gage actually played a big role in the field of neurology. 
His case helped scientists better understand the role of the frontal cortex of the brain (Cherry, par. 16–17).\n\nBibliography\n\nCherry, Kendra. “The Famous Case of Phineas Gage’s Astonishing Brain Injury.” Phineas Gage’s Astonishing Brain Injury, Verywell Mind, 3 Oct. 2019, www.verywellmind.com/phineas-gage-2795244#targetText=The%20rod%20penetrated%20Gage's%20left,be%20seen%20by%20a%20doctor.","['Brain', 'Health', 'Development', 'Psychology', 'Science']"


### Check formatting within text column

In [69]:
print(df["text"].iloc[4])  # Print the full article text of the row at position 4

You’ve heard of him, haven’t you? Phineas Gage. The railroad worker who survived an explosion that involved an iron rod piercing through his left cheek and out of his brain and skull.

Yeah.

I know.

You’re probably wondering “yeah, alright sweet. What about him?” Well, let’s just say that he was a really popular patient for the field of neuroscience (Cherry, par. 1). And what I found the most interesting about this tragic event was the science of his behavior afterward.

For those of you who don’t know much about Phineas Gage, let me fill you in with the help of my research.

Phineas Gage, 25 years old, was a railroad worker in Vermont. One day, at work, he was using an iron rod to handle explosive gun powder. As he was using the iron rod to handle the gun powder, an explosion suddenly occurred. The iron rod then went through his left cheek and brain. Fortunately, he survived and was able to talk and walk after the accident (Cherry, par. 2–3).

Why did people say that Phineas Gage wa

### Parse Data Sets from DataFrame

In [75]:
from sklearn.model_selection import train_test_split

# Split into training and temp (validation + test) set (70% train, 30% temp)
train, temp = train_test_split(df, test_size=0.3, random_state=42)

# Split the temp set into validation and test set (50% validation, 50% test of the 30%)
validate, test = train_test_split(temp, test_size=0.5, random_state=42)

# Print counts of rows in each set
print(f"Training Set Size: {train.shape[0]}")
print(f"Validation Set Size: {validate.shape[0]}")
print(f"Test Set Size: {test.shape[0]}")

Training Set Size: 134657
Validation Set Size: 28855
Test Set Size: 28856
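
The set sizes above follow from how `train_test_split` handles a fractional `test_size`: the held-out portion is rounded up and the other portion takes the remainder. A minimal sketch of that arithmetic, using the row total implied by the printed counts (the helper name `split_sizes` is hypothetical and not part of scikit-learn):

```python
import math


def split_sizes(n_rows: int, test_fraction: float) -> tuple[int, int]:
    """Return (n_kept, n_held_out), rounding the held-out portion up."""
    n_held_out = math.ceil(test_fraction * n_rows)
    return n_rows - n_held_out, n_held_out


# Row total implied by the three printed set sizes
total_rows = 134657 + 28855 + 28856

# First split: 70% train, 30% temp
n_train, n_temp = split_sizes(total_rows, 0.3)

# Second split: temp divided in half into validation and test
n_validate, n_test = split_sizes(n_temp, 0.5)

print(n_train, n_validate, n_test)  # 134657 28855 28856
```

The odd size of the temp set (57711 rows) is why the test set ends up one row larger than the validation set.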


## Solution

### Approach

One or more LLM-based tools, each with its own engineered prompt and, where needed, tool-specific training.
The design is modular, function-based, and extensible.


#### Input: content of the article to analyse

(Ideally, the article contents are passed in directly. An alternative is to accept a URL, but that would require additional Python support functions to fetch the page and clean up its contents prior to submission.)
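
If URL input were supported, the fetch-and-clean step might look roughly like the sketch below. It uses only the standard library; `html_to_text` and `fetch_article_text` are hypothetical helper names, and real article pages would likely need more cleanup (navigation, footers, cookie banners) than this handles.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce an HTML document to its text blocks, separated by blank lines."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.chunks)


def fetch_article_text(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its cleaned text (assumes a plain HTML response)."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        html = resp.read().decode(charset, errors="replace")
    return html_to_text(html)


sample_html = (
    "<html><head><style>p{}</style></head><body>"
    "<h1>Title</h1><p>Hello &amp; welcome.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(html_to_text(sample_html))
```

The cleaned string could then be passed to the tools in the same way as the `text` column of the dataset.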

#### Output: variable per tool/function

### Instantiate an LLM, and set it up to support one or more functions

In [78]:
# Instantiate an LLM
from typing import Any, Callable, Optional, Tuple, Union

from google import genai
from google.cloud import bigquery
from google.genai.types import (
    FunctionDeclaration,
    GenerateContentConfig,
    GenerateContentResponse,
    Part,
    Schema,
    Tool,
)
from IPython.display import Markdown

REGION = "us-central1"
PROJECT = !(gcloud config get-value core/project)
PROJECT = PROJECT[0]

MODEL = "gemini-2.0-flash-001"

client = genai.Client(vertexai=True, project=PROJECT, location=REGION)

# Define the ChatAgent parent class
class ChatAgent:
    def __init__(
        self,
        tools: list[Tool],
        tool_handler_fn: Callable[[str, dict], Any],
        max_iterative_calls: int = 5,
    ):
        self.tools = tools
        self.tool_handler_fn = tool_handler_fn
        self.chat_session = client.chats.create(
            model=MODEL,
            config=GenerateContentConfig(tools=tools),
        )
        # Honor the caller-supplied limit instead of a hard-coded value
        self.max_iterative_calls = max_iterative_calls

    def send_message(self, message: str) -> GenerateContentResponse:
        response = self.chat_session.send_message(message)
        # This is None if a function call was not triggered
        fn_calls = response.function_calls

        num_calls = 0
        # Reasoning loop. If fn_calls is empty then we never enter this
        # and simply return the response
        while fn_calls:
            if num_calls > self.max_iterative_calls:
                break

            # Handle the function calls
            fn_call_responses = []
            for fn_call in fn_calls:
                # Keep the tool result separate from the model response
                fn_result = self.tool_handler_fn(
                    fn_call.name, dict(fn_call.args)
                )
                fn_call_responses.append(
                    Part.from_function_response(
                        name=fn_call.name,
                        response={
                            "content": fn_result,
                        },
                    ),
                )
                num_calls += 1

            # Send the function call result back to the model
            response = self.chat_session.send_message(fn_call_responses)

            # If the response is another function call then we want to
            # stay in the reasoning loop and keep calling functions.
            fn_calls = response.function_calls

        return response
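
`ChatAgent` expects a `tool_handler_fn(name, args)` callable but the notebook does not show one. A minimal sketch of such a dispatcher follows; `make_tool_handler` and the `word_count` tool are hypothetical names used for illustration only, and the error-dictionary convention is an assumption rather than anything the Gemini SDK requires.

```python
from typing import Any, Callable


def make_tool_handler(
    registry: dict[str, Callable[..., Any]],
) -> Callable[[str, dict], Any]:
    """Build a handler that routes a function-call name to a Python callable."""

    def handle(fn_name: str, fn_args: dict) -> Any:
        tool = registry.get(fn_name)
        if tool is None:
            # Report unknown tools back to the model instead of raising
            return {"error": f"unknown tool: {fn_name}"}
        try:
            return tool(**fn_args)
        except TypeError as exc:
            # Typically raised when the supplied arguments do not match the signature
            return {"error": f"bad arguments for {fn_name}: {exc}"}

    return handle


def word_count(article_text: str) -> int:
    """Hypothetical stand-in tool: count whitespace-separated words."""
    return len(article_text.split())


handler = make_tool_handler({"word_count": word_count})
print(handler("word_count", {"article_text": "one two three"}))  # 3
print(handler("missing_tool", {}))
```

A handler built this way could be passed as `tool_handler_fn` when constructing a `ChatAgent`, so that adding a tool means adding one registry entry.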

### Tools

#### Article Topic Classifier

input: article body  
output: one of a curated set of topic categories, based upon highest probability match

##### Create Custom Query and Response Templates

###### LLM Input

In [177]:
# Define the function declaration for the topic classifier
from pydantic import BaseModel, Extra, Field
from typing import Dict, Any
from enum import Enum

# Define an Enum for topics
class TopicEnum(str, Enum):
    arts = "arts"
    business = "business"
    entertainment = "entertainment"
    culture = "culture"
    literature = "literature"
    medicine = "medicine"
    music = "music"
    personal_development = "personal development"
    philosophy = "philosophy"
    politics = "politics"
    religion = "religion"
    science = "science"
    sports = "sports"
    technology = "technology"
    us_news = "us news"
    world = "world"

    # Use a @classmethod to safely store descriptions
    @classmethod
    def _get_descriptions(cls):
        return {
            "arts": "Cultural and creative activities, including fine arts, theater, and music.",
            "business": "The activities related to the production, distribution, and sale of goods and services.",
            "entertainment": "Media, performance, and activities designed to entertain an audience.",
            "culture": "The shared customs, arts, and social institutions of a particular group of people.",
            "literature": "Written works, books, essays, poems, et cetera, especially those considered of superior or lasting artistic merit.",
            "medicine": "The field of health and healing, including clinical practices and healthcare.",
            "music": "An art form that uses sound and rhythm to express emotions, ideas, and cultural identity.",
            "personal_development": "Activities and practices that improve awareness and identity, develop talents, and enhance the quality of life.",
            "philosophy": "The study of fundamental questions regarding existence, knowledge, ethics, reason, and the mind.",
            "politics": "The activities associated with governance, policy, and political ideologies.",
            "religion": "The system of beliefs, practices, and worship regarding a deity or deities.",
            "science": "Systematic enterprise that builds and organizes knowledge through testable explanations and predictions.",
            "sports": "Physical activities involving skill, competition, and fitness.",
            "technology": "The application of scientific knowledge for practical purposes, particularly in industry.",
            "us_news": "News related to events, politics, and issues within the United States.",
            "world": "Global news, issues, and events occurring internationally."
        }

    @classmethod
    def get_description(cls, topic: "TopicEnum") -> str:
        """Fetches the description for a given topic using the class-level _descriptions dictionary."""
        descriptions = cls._get_descriptions()  # Safely fetch the descriptions
        # Convert spaces in topic.value to underscores to match the dictionary keys
        key = topic.value.replace(" ", "_")
        if key not in descriptions:
            raise ValueError(f"No description found for topic: {topic.value}")
        return descriptions[key]

class Schema(BaseModel):
    type: str
    properties: Dict[str, Any]
    required: list

class FunctionDeclarationWithExtra(BaseModel):
    name: str
    description: str
    parameters: Schema
    # Allow extra fields (like PROMPT)
    class Config:
        extra = Extra.allow
        
topic_list = "\n".join([f"{topic.value.capitalize()}: {TopicEnum.get_description(topic)}" for topic in TopicEnum])

# Define the function declaration for the topic classifier
topic_classifier_tool_handler_fn = FunctionDeclarationWithExtra(
    name="topic_classifier",
    description="Identify the article topic by analysing the contents of the article",
    parameters=Schema(
        type="OBJECT",
        properties={
            "article_text": {  # Define the expected 'article_text' input
                "type": "STRING",
                "description": "The content of the article to classify."
            },
            "topic": {
                "type": "STRING",
                "description": "Topic",
                "enum": [topic.value for topic in TopicEnum]
            },
        },
        required=["article_text", "topic"],
    ),
    PROMPT="""
    Identify the most relevant topic for the following article:

    {article_text}
    
    Review the enumerated topic categories and their descriptions, and categorize the article based on both the topics and their descriptions.
    If an article IS a poem, the topic is 'literature'.
    Only return one of the enumerated topics:
    
    {topic_list}
    
    """
)

# Function to classify article text based on topic classification
def classify_article_and_get_tags(article_text: str):
    # Dynamically set the prompt with the actual article text
    formatted_prompt = topic_classifier_tool_handler_fn.PROMPT.format(
        article_text=article_text,
        topic_list=topic_list
    )
    
    # Send the article text to the model for classification
    response = client.models.generate_content(
        model=MODEL,
        contents=formatted_prompt,
        config=GenerateContentConfig(
            response_mime_type="text/x.enum",
            response_schema={
                "type": "STRING",
                "enum": [topic.value for topic in TopicEnum],
            },
        ),
    )
    
    # The response contains the classification result
    classification_result = response.text.strip()  # Clean up the result
    
    return classification_result

# Example article snippets for ad hoc testing (the loop below reads from the test split instead)
article_data = [
    "You’ve heard of him, haven’t you? Phineas Gage. The railroad worker who survived an explosion...",
    "Another article text that talks about politics and economy...",
    # Add more articles as needed
]

# Uncomment to inspect the topic list that is inserted into the prompt
# print(topic_list)

# Process each article and classify its topic
for idx, row in test.head(10).iterrows():
    article_text = row["text"]
    predicted_topic = classify_article_and_get_tags(article_text)
    actual_tags = row["tags"]  # Get the tags for the current row
    print(f"{article_text}\n")
    print("---------------------------------------------------------------------------------\n")
    print(f"Predicted Article Topic: {predicted_topic}\t\tDataset tags: {actual_tags}\n")
    print("---------------------------------------------------------------------------------\n\n")

Lloyd Austin, Secretary of Defense

While Gen. Lloyd Austin would be the first Black Defense Secretary, his terrible record should overshadow this feat, but we know that is going to be brought up in mainstream media. As Gen. Austin was tapped by Obama to begin the process of the withdrawal of American troops from Iraq in 2010, it was an unorganized effort leaving a depleted Iraqi government vulnerable with a military not trained or equipped to quell the rise of ISIS. The disaster did not stop there as Gen. Austin was then given the responsibility of overseeing a Syrian rebel program to combat ISIS that cost us $384 million, which ended in failure after the millions of dollars earmarked to build the training camps were rarely if ever used.

Shortly after all of these debacles, Gen. Austin sold out to defense contractors and wealthy investment funds becoming a board member at Raytheon and a partner at Pine Island Capital Partners, who both stand to profit immensely from a Biden administr
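
The loop above prints the predicted topic next to the dataset tags for manual comparison. The `tags` column holds stringified Python lists (as seen in the `df.head()` output), so a rough automated check could parse them first. The sketch below is one possible approach; `parse_tags` and `topic_in_tags` are hypothetical helpers, and an exact case-insensitive match is only a crude proxy because the free-form Medium tags do not map one-to-one onto `TopicEnum`.

```python
import ast


def parse_tags(raw_tags: str) -> list[str]:
    """Parse a stringified list such as "['Brain', 'Health']" into a list of strings."""
    try:
        value = ast.literal_eval(raw_tags)
    except (ValueError, SyntaxError):
        return []
    if not isinstance(value, (list, tuple)):
        return []
    return [str(tag) for tag in value]


def topic_in_tags(predicted_topic: str, raw_tags: str) -> bool:
    """Return True if the predicted topic equals one of the tags, ignoring case."""
    topic = predicted_topic.strip().lower()
    return any(topic == tag.strip().lower() for tag in parse_tags(raw_tags))


raw = "['Brain', 'Health', 'Development', 'Psychology', 'Science']"
print(parse_tags(raw))
print(topic_in_tags("science", raw))  # True
print(topic_in_tags("sports", raw))   # False
```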

#### Sport and Score Extractor

input: article body  
output: identified sport, score summary

In [None]:

# Import necessary modules from pydantic
from pydantic import BaseModel, Extra
from typing import Dict, Any
import pandas as pd  # Assuming you're using pandas for the test dataframe

# Define the Schema class
class Schema(BaseModel):
    type: str
    properties: Dict[str, Any]
    required: list

# Define the FunctionDeclarationWithExtra class
class FunctionDeclarationWithExtra(BaseModel):
    name: str
    description: str
    parameters: Schema
    # Allow extra fields (like PROMPT)
    class Config:
        extra = Extra.allow

# Define the function declaration for score extraction
score_extraction_func = FunctionDeclarationWithExtra(
    name="score_extraction",
    description="Extract the scores from the article if feasible",
    parameters=Schema(
        type="OBJECT",
        properties={
            "article_text": {  # Define the expected 'article_text' input
                "type": "STRING",
                "description": "The content of the article to extract scores."
            },
            "score_available": {
                "type": "BOOLEAN",
                "description": "Indicates if scores are available in the article."
            },
            "team1_score": {
                "type": "INTEGER",
                "description": "Score for team 1 if available."
            },
            "team2_score": {
                "type": "INTEGER",
                "description": "Score for team 2 if available."
            },
            "team1_name": {
                "type": "STRING",
                "description": "Name of team 1 if mentioned."
            },
            "team2_name": {
                "type": "STRING",
                "description": "Name of team 2 if mentioned."
            },
        },
        required=["article_text", "score_available"],
    ),
    PROMPT="""
    Given the following article, extract the score if feasible:

    {article_text}

    Provide the output in a structured JSON format:
    {{
        "score_available": true/false,
        "team1_name": "name_of_team1",
        "team1_score": score_of_team1,
        "team2_name": "name_of_team2",
        "team2_score": score_of_team2
    }}
    """
)

def extract_scores_from_article(article_text: str):
    # Dynamically set the prompt with the actual article text
    formatted_prompt = score_extraction_func.PROMPT.format(article_text=article_text)

    # Request JSON output constrained by a response schema
    response = client.models.generate_content(
        model=MODEL,
        contents=formatted_prompt,
        config=GenerateContentConfig(
            response_mime_type="application/json",
            response_schema={
                "type": "OBJECT",
                "properties": {
                    "score_available": {
                        "type": "BOOLEAN",
                    },
                    "team1_name": {
                        "type": "STRING",
                    },
                    "team1_score": {
                        "type": "INTEGER",
                    },
                    "team2_name": {
                        "type": "STRING",
                    },
                    "team2_score": {
                        "type": "INTEGER",
                    }
                },
                "required": ["score_available"],
            },
        ),
    )

    # response.text holds the model's JSON output; response.json() would instead
    # serialize the whole response object, which has no top-level score fields
    scores_extraction_result = json.loads(response.text)

    return scores_extraction_result

# Loop through the test dataframe and extract scores
for idx, row in test.head(2000).iterrows():  # Adjust the dataframe as necessary
    article_text = row["text"]
    extracted_scores = extract_scores_from_article(article_text)
    actual_tags = row["tags"]  # Get the tags for the current row

    if extracted_scores.get("score_available"):
        print(f"Extracted Scores: {extracted_scores}\t\tActual tags: {actual_tags}")
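
Because only `score_available` is required in the response schema, a result may claim a score is available while omitting the team fields. A small post-processing sketch that guards against this and produces a one-line summary is shown below; `summarize_score` is a hypothetical helper, and treating the higher score as the winner is an assumption that does not hold for every sport (golf, for example).

```python
from typing import Optional


def summarize_score(result: dict) -> Optional[str]:
    """Return a one-line score summary, or None if the result is incomplete."""
    if not result.get("score_available"):
        return None

    score1 = result.get("team1_score")
    score2 = result.get("team2_score")
    if not isinstance(score1, int) or not isinstance(score2, int):
        return None  # flagged as available, but scores are missing

    name1 = result.get("team1_name") or "Team 1"
    name2 = result.get("team2_name") or "Team 2"

    if score1 == score2:
        return f"{name1} {score1} - {score2} {name2} (draw)"
    winner = name1 if score1 > score2 else name2
    return f"{name1} {score1} - {score2} {name2} ({winner} won)"


example = {
    "score_available": True,
    "team1_name": "Lions",
    "team1_score": 3,
    "team2_name": "Bears",
    "team2_score": 1,
}
print(summarize_score(example))  # Lions 3 - 1 Bears (Lions won)
```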

#### Sport Statistics Summarizer

input: article body  
output: statistics summary

In [None]:

# Import necessary modules from pydantic
from pydantic import BaseModel, Extra
from typing import Dict, Any
import pandas as pd  # Assuming you're using pandas for the test dataframe
import json

# Define the Schema class
class Schema(BaseModel):
    type: str
    properties: Dict[str, Any]
    required: list

# Define the FunctionDeclarationWithExtra class
class FunctionDeclarationWithExtra(BaseModel):
    name: str
    description: str
    parameters: Schema
    # Allow extra fields (like PROMPT)
    class Config:
        extra = Extra.allow

# Define the function declaration for sport mention and person summary extraction
sport_person_extraction_func = FunctionDeclarationWithExtra(
    name="sport_person_extraction",
    description="Extract mentions of sports and summarize information related to individuals mentioned in the article if feasible",
    parameters=Schema(
        type="OBJECT",
        properties={
            "article_text": {  # Define the expected 'article_text' input
                "type": "STRING",
                "description": "The content of the article to extract sport mentions and person summaries."
            },
            "sport_mentioned": {
                "type": "BOOLEAN",
                "description": "Indicates if any sports are mentioned in the article."
            },
            "person_name": {
                "type": "STRING",
                "description": "Name of the person mentioned in relation to the sport."
            },
            "summary": {
                "type": "STRING",
                "description": "Summary of what is said about the person."
            },
        },
        required=["article_text", "sport_mentioned"],
    ),
    PROMPT="""
    Given the following article, identify whether any sports are mentioned. If a sport is mentioned, find a person named in the article in connection with it and summarize what the article says about that person.

    {article_text}

    Provide the output in a structured JSON format:
    {{
        "sport_mentioned": true/false,
        "person_name": "name_of_person_mentioned",
        "summary": "summary_about_person"
    }}
    """
)

def extract_sport_person_summary_from_article(article_text: str):
    # Dynamically set the prompt with the actual article text
    formatted_prompt = sport_person_extraction_func.PROMPT.format(article_text=article_text)

    # Request JSON output constrained by a response schema
    response = client.models.generate_content(
        model=MODEL,  # Model ID defined earlier in the notebook
        contents=formatted_prompt,
        config=GenerateContentConfig(
            response_mime_type="application/json",
            response_schema={
                "type": "OBJECT",
                "properties": {
                    "sport_mentioned": {
                        "type": "BOOLEAN",
                    },
                    "person_name": {
                        "type": "STRING",
                    },
                    "summary": {
                        "type": "STRING",
                    }
                },
                "required": ["sport_mentioned"],
            },
        ),
    )

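    # Parse the serialized response; the structured output sits under 'parsed'.
    # Fall back to an empty dict when 'parsed' is missing or null.
    response_data = json.loads(response.json())
    sport_person_extraction_result = response_data.get('parsed') or {}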

    return sport_person_extraction_result

# Loop through the test dataframe and extract sport mentions and person summaries
for idx, row in test.head(2000).iterrows():
    article_text = row["text"]
    extracted_info = extract_sport_person_summary_from_article(article_text)
    actual_tags = row["tags"]  # Get the tags for the current row

    if extracted_info.get("sport_mentioned"):
        print(f"Extracted Info: {extracted_info}\t\tActual tags: {actual_tags}")
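
Both extraction loops make up to 2,000 model calls, so a single empty or malformed reply would otherwise stop the run partway through. A defensive parsing sketch is shown below; `parse_json_object` is a hypothetical helper, and it assumes the raw JSON text of the reply is available as a string (or `None` when no text was returned).

```python
import json
from typing import Optional


def parse_json_object(raw: Optional[str]) -> dict:
    """Parse a JSON object from model output, returning {} on any failure."""
    if not raw:
        return {}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    # Only accept a JSON object; lists, strings, and numbers are rejected
    return data if isinstance(data, dict) else {}


print(parse_json_object('{"sport_mentioned": true, "person_name": "A. Player"}'))
print(parse_json_object("not json"))  # {}
print(parse_json_object(None))        # {}
```

Returning an empty dict keeps the later `.get("sport_mentioned")` check valid, so a bad reply is simply skipped.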

### API Deployment (Input/Output)

Webapp endpoint

#### Support Pipeline

##### Dockerize LLM

##### Stand up Vertex Endpoint

## Reference Labs

- Gemini Function Calling
- Gemini Prompt Engineering
- AutoML for Text Classification - Vertex
- KFP Walkthrough - Vertex Containerization - Training and Deployment Pipelines