# DIGI405 Lab Class 11: Sentiment Analysis

This week’s class will investigate lexicon-based sentiment analysis with Vader (‘Valence Aware Dictionary and sEntiment Reasoner’). Vader is open-source software, so you can inspect the code and modify it if you wish. In this week’s lab we will mainly refer to the lexicon.

The following cell imports the libraries and creates a SentimentIntensityAnalyzer object.

In [1]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import pandas as pd
pd.set_option('display.max_colwidth', 140)
analyzer = SentimentIntensityAnalyzer()

Read the "About the Scoring" section of the Vader GitHub README, which explains the scores that Vader returns:  
https://github.com/cjhutto/vaderSentiment#about-the-scoring

**QUESTION:** What range of values of the Compound Score should be associated with a "neutral" classification?

*between -0.05 and +0.05*
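
To make these thresholds concrete, here is a minimal sketch of how a compound score could be turned into a label. The function name `classify_compound` is made up for this lab and is not part of Vader; the cut-offs are the ones suggested in the README.

```python
def classify_compound(compound, threshold=0.05):
    """Map a Vader compound score to a label using the README's suggested cut-offs.

    Hypothetical helper for this lab; Vader itself only returns the scores.
    """
    if compound >= threshold:
        return 'positive'
    if compound <= -threshold:
        return 'negative'
    return 'neutral'


for score in (-0.4767, 0.0, 0.9999):
    print(score, classify_compound(score))
```

The threshold is a convention rather than a law, so you may want to experiment with other values when comparing against labelled data.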

## Score some text and understand Vader's lexicon and booster/negation rules

In the cell below is a short phrase to show you the output of Vader. 

First, run it on this text and make sure you understand what each number tells us. 

**ACTIVITY:** Try different text and make sure you understand the scores Vader returns.

Try:
1. A sentence that is obviously positive like "The movie is great"
2. A sentence that uses a "booster" e.g. "The movie is really terrible"
3. A sentence that uses negation e.g. "The movie is not great". 
4. Some sentences that attempt to fool Vader. 

Look at the lexicon and the booster/negation words in the code to get more insight into the scores. 

The main Vader module (including negations and booster words on lines 48-181): https://github.com/nltk/nltk/blob/develop/nltk/sentiment/vader.py 

The Vader lexicon, which you can search in your browser or download and use as a text file:
https://github.com/cjhutto/vaderSentiment/blob/master/vaderSentiment/vader_lexicon.txt 
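
The booster and negation rules can be sketched in a few lines. This is a simplified illustration, not the library's code: the constants `B_INCR` and `N_SCALAR` are the values as we understand them from `vader.py` (check them against the source), the helper names are made up, and the valences used for "terrible" (-2.1) and "great" (3.1) are assumptions you should verify in the lexicon. The real module also scales these effects by word distance, which is ignored here.

```python
B_INCR = 0.293    # booster increment (assumed from vader.py; verify against the source)
N_SCALAR = -0.74  # negation multiplier (assumed from vader.py; verify against the source)


def boost(valence, increment=B_INCR):
    """Push a valence further from zero, as an adjacent intensifier such as 'really' does."""
    if valence == 0:
        return 0.0  # words outside the lexicon are not boosted
    return valence + increment if valence > 0 else valence - increment


def negate(valence, scalar=N_SCALAR):
    """Flip and dampen a valence, as a preceding negation such as 'not' does."""
    return valence * scalar


print(boost(-2.1))   # 'really terrible': more negative than 'terrible' alone
print(negate(3.1))   # 'not great': negative, but weaker than 'great' is positive
```

Notice that negation does not simply reverse the sign: "not great" ends up less extreme than "great".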

Make sure you are clear what the values in the Vader lexicon actually mean. Here are some examples for your reference:

    hope 	1.9 0.53852 [3, 2, 2, 1, 2, 2, 1, 2, 2, 2]
    hopeless -2.0 1.78885 [-3, -3, -3, -3, 3, -1, -3, -3, -2, -2]
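
Each line of the lexicon file has four tab-separated fields: the token, the mean of the human ratings, their standard deviation, and the raw ratings themselves (ten raters, on a scale from -4 to +4). The sketch below parses the two example lines and recomputes the summary values; `parse_lexicon_line` is a hypothetical helper written for this lab, not a Vader function.

```python
import ast
import statistics


def parse_lexicon_line(line):
    """Split one lexicon line into (token, mean, std, ratings). Hypothetical helper."""
    token, mean, std, ratings = line.rstrip('\n').split('\t')
    return token, float(mean), float(std), ast.literal_eval(ratings)


examples = [
    "hope\t1.9\t0.53852\t[3, 2, 2, 1, 2, 2, 1, 2, 2, 2]",
    "hopeless\t-2.0\t1.78885\t[-3, -3, -3, -3, 3, -1, -3, -3, -2, -2]",
]

for line in examples:
    token, mean, std, ratings = parse_lexicon_line(line)
    # The stored values are the mean and the population standard deviation of the ratings.
    print(token, mean, statistics.mean(ratings), std, round(statistics.pstdev(ratings), 5))
```

The larger standard deviation for "hopeless" comes from one rater who scored it +3, a reminder that the lexicon values are averages over human judgements that do not always agree.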

In [3]:
example = '''
The movie is terrible.
'''
vs = analyzer.polarity_scores(example)
print(vs)

{'neg': 0.508, 'neu': 0.492, 'pos': 0.0, 'compound': -0.4767}
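
To see where these four numbers come from, the sketch below recomputes them from per-word valences. It is a simplification under stated assumptions: `simple_vader_scores` is a made-up function, it ignores boosters, negation, capitalisation and punctuation emphasis, and it assumes "terrible" has a lexicon valence of -2.1 (check the lexicon) while "The", "movie" and "is" are not in the lexicon and count as 0.0.

```python
import math


def simple_vader_scores(valences, alpha=15):
    """Approximate Vader's output from a list of per-word valences (hypothetical helper)."""
    if not valences:
        return {'neg': 0.0, 'neu': 0.0, 'pos': 0.0, 'compound': 0.0}
    total = sum(valences)
    # Compound: the summed valence squashed into the range -1 to +1.
    compound = total / math.sqrt(total * total + alpha)
    # Each sentiment-bearing word counts as its magnitude plus 1; each neutral word counts as 1.
    pos_sum = sum(v + 1 for v in valences if v > 0)
    neg_sum = sum(-v + 1 for v in valences if v < 0)
    neu_count = sum(1 for v in valences if v == 0)
    denom = pos_sum + neg_sum + neu_count
    return {
        'neg': round(neg_sum / denom, 3),
        'neu': round(neu_count / denom, 3),
        'pos': round(pos_sum / denom, 3),
        'compound': round(compound, 4),
    }


# "The movie is terrible." -> three neutral words and one negative word
print(simple_vader_scores([0.0, 0.0, 0.0, -2.1]))
```

Under these assumptions the result agrees with the output above: `neg`, `neu` and `pos` are proportions that sum to 1, while `compound` is a separate, normalised total.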


## Scoring a whole review

This is a review from the movie reviews dataset we used last week. 

Run the cell below to get the scores for this movie review.

**ACTIVITY:**
Download the dataset here: https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages/corpora/movie_reviews.zip 

Try some different reviews from the dataset and see what scores Vader comes up with. 

**QUESTION:** Are the scores correct against the actual label?

In [15]:
review = '''
Jeff: (00:00)
… pandemic and to prepare for any future threats. HHS is soliciting interest from companies that have experience manufacturing mRNA vaccines to identify opportunities to scale up their production capacity. Importantly, initial production could provide more mRNA COVID vaccines for the world. The goal of this program is to expand existing capacity by an additional billion doses per year with production starting by the second half of 2022.
Jeff: (00:37)
This program would also help us produce doses within six to nine months of identification of a future pathogen and ensure enough vaccines for all Americans. It would combine the expertise of the US government in basic scientific research with the robust ability of pharmaceutical companies to manufacture mRNA vaccines. We hope companies step up and act quickly to take us up on this opportunity to expand production of mRNA vaccines for the current pandemic and set us up to react quickly to any future pandemic threats.
Jeff: (01:15)
We know vaccinations are the best way to accelerate our path out of the pandemic. And taking a look at the data, we know the president’s plan is producing results. 80% of Americans, 12 and older with at least one shot, 10% of kids already with their first shot just 10 days into our program being at full strength, 250 million doses donated and delivered to the world. We know there’s more work to do, but these milestones represent critical progress and show we are on the right track in our fight against the virus. With that, let me turn it over to Dr. Walensky. Dr. Walensky… Oops. Can I be unmuted? Dr. Walensky is muted.
Start from the top, please. Thank you.
Thank you, Jeff. And good morning, everyone. As always, I’d like to start by walking you through the data. The current seven day daily average of cases is about 83,600 cases per day. The seven day average of hospital admissions is about 5,300 per day. And seven day average daily deaths are about 1,000 per day. Week by week, I present to you the current state of the pandemic on a national scale. These data reflect our entire country across all states, all age groups, and all people and are independent of underlying medical conditions or vaccination status.
At CDC, we look at data more granularly to understand important trends, separated out by different demographic and geographic subgroups. We do so to understand who may be at greater risk of COVID 19, who may be at risk for complications and to assess the waning vaccine effectiveness and what interventions we can put in place to help protect those who are most vulnerable. Studies show that those who are unvaccinated continue to be more likely to be infected, more likely to be in a hospital, and more likely to have severe complications from COVID-19.
In recent weeks, we have also seen additional data that reinforce the importance of COVID-19 boosters for these popular, at higher risk of severe disease, particularly to ensure protection against severe illness and hospitalizations. Those who live in long-term care facilities and adults over age 65 were among the first eligible for vaccination. And as vaccine coverage increased in these groups, we saw emergency department visits decline in both after early and robust vaccination efforts in January and February. We had powerful evidence that demonstrated that vaccines are effective and provide protection against the severe complications of COVID-19, especially in those at risk because of their age or underlying conditions.
Since then, we’ve been watching vaccine effectiveness in this population carefully. Although the highest risk are those people who are unvaccinated, we are seeing an increase in emergency department visits among adults age 65 and older, which are now again higher than they are for younger age groups.
Similarly, we also have new data that look at COVID-19 cases in long-term care facilities from our national healthcare safety network. When we compare rates of COVID-19 disease between those who are vaccinated with two doses, and those who have received a booster dose, the rate of disease is markedly lower for those who to their booster shot demonstrating our boosters are working. FDA is currently evaluating data on the authorization of booster doses for all people over age 18.
As we’ve done before, CDC will quickly review the safety and effectiveness data and make recommendations as soon as we hear from FDA. So we want to reinforce the importance of people who are eligible getting boosted now, especially those at highest risk for severe disease. This time of year, we typically see other respiratory viruses circulating like influenza. Last week’s influenza surveillance report noted an increase in flu activity that could mark the beginning of the influenza season.
We have been anticipating the return of flu viruses this season. If you’re wondering if you should get a flu vaccine, you should. It’ll protect you and your family against the flu. What’s the best gift to give this year? Consider the gift of health. It’s priceless. As we head into the holiday and winter season now is the time to think about protection for ourselves and our families. So many of us miss being with our friends and family last year. For those who are at higher risk of severe illness from COVID-19 and who are eligible for a COVID-19 booster dose, go out now and get your extra booster dose to protect you. And for those who are not yet vaccinated, including our children, teens, and adolescents who are now all eligible for vaccination, getting vaccinated this week will set you up to being fully protected in time for the holidays and by the end of the year. Thank you. I’ll now turn it over to Dr. Fauci.
Dr. Fauci: (06:50)
Thank you very much, Dr. Walensky. What I’d like to do in the next few minutes is underscore some of this scientific data on why vaccines protect namely what is the real world effectiveness of the COVID-19 vaccines. And the reason I say that is that as more and more people get vaccinated, no vaccine is a 100% effective. So we’re hearing reports of even vaccinated people getting infected, having some people to in fact question the effectiveness of vaccines. So let’s take a look at that to see if we can clarify that situation. Next slide.
Dr. Fauci: (07:29)
Let’s just go to different places. Let’s take Texas and look at a comparison of unvaccinated people with vaccinated people. Unvaccinated people were 13 times more likely to become infected than fully vaccinated and unvaccinated people were 20 times more likely to die than fully vaccinated people. And that is the state of Texas. Next slide.
Dr. Fauci: (07:55)
Let’s take a look now at the age range of that, because people may think that when you’re in a certain age range, you may not benefit from vaccines. Look at the left hand part of the slide is the different age groups. Look at the right hand part of the slide in red, which shows you the fold higher death rate in the unvaccinated compared to the vaccinated within that given age group that you see on the left. Next slide.
Dr. Fauci: (08:27)
Let’s go to Indiana and take a look. The blue is those who are fully vaccinated and the red are not fully vaccinated. And we’re comparing new hospital admissions, total admissions to the ICU, and total debts. It’s pretty clear to see the difference between the density of the red versus the density of the blue. Next slide.
Dr. Fauci: (08:56)
Let’s take a look at Virginia and take a look at among over 5 million Virginians who’ve been vaccinated. The hospitalization rate is extraordinarily low at 0.035%. And those who’ve died is 0.0125. Next slide. This isn’t only in the United States. Let’s take a look at New South Wales in Australia. Again, unvaccinated individuals are more than 16 times more likely to end up in the ICU or die during the peak period. So any concern about efficacy in the real world of vaccines, I hope we’ve put that to rest. Next slide.
Dr. Fauci: (09:41)
So what do we have? We have 62 million Americans eligible for vaccines who are still not vaccinated. The data that I show you do not lie. Vaccines protect you, your family, and your community. And importantly, it is not too late as Dr. Walensky has said. Get vaccinated now. And importantly, if you are already vaccinated six months or more ago and eligible for a boost, get a boost because as a matter of fact, the data that I just showed you for vaccinations in general hold true for boosters because the Israelis have shown that when you boost you multifold diminish the likelihood of getting infected, getting sick, or dying. And a recent paper from England, it isn’t only Israel shows for those who got the Pfizer vaccine months earlier, and the efficacy against symptomatic infection decreased to 62.5% with a boost, it went up to 94%. So vaccinations work and boosters optimize the vaccination. Having said that, let us all will enjoy the holidays. Now over to you, Dr. Murphy.
Well, thanks so much Dr. Fauci, and it’s good to be with everyone again today. I want to speak a little bit today about misinformation and our efforts about misinformation. Just a few months ago, my office released a Surgeon General’s advisory on health misinformation, which is false and accurate or misleading information according to the best evidence at the time. Now in this advisory, we declared that health misinformation is an urgent public health threat. Since then, I’ve been inspired by people all over the country who have stepped up to confront misinformation.
I’ve heard from doctors and nurses and pharmacists who have talked to their local schools about vaccines. I’ve heard from local elected officials who are making health misinformation a priority. Just a couple weeks ago, I joined hundreds of people of faith who gathered virtually to hear the truth about the COVID-19 vaccines and to share that with their communities.
So we are continuing to meet people where they are today. And today, we’re also happy to welcome the singer and philanthropist Ciara to the white house, where she will participate in a round table about getting vaccinated with First Lady Dr. Jill Biden. These are just a few of the many encouraging steps that we’ve seen over the last few months, but despite all of that health misinformation remains a threat today. With new milestones, like vaccines for kids ages 5 through 11, we’re seeing new waves of misinformation hitting the inboxes of parents and social media feeds across the country. This isn’t just something that affects a small segment of society. A recent poll indicated that nearly 80% of American adults either believe or aren’t sure about a common COVID [inaudible 00:12:41]. As clinicians, community leaders, and other partners tell me as well, they are continuing to hear people quoting these myths to explain why they won’t get vaccinated.
I’ve even heard some of these myths from my own family members who have received misleading videos and false articles through text chains and social media feeds. And I had to talk to my family members about why this content is harmful, but while it’s clear that stopping misinformation is an urgent task, we recognized it’s not always clear to people what they can do to help. And that’s why last week I released a community toolkit for addressing health misinformation. This toolkit is meant for everyone. From healthcare professionals, to teachers, to librarians, to faith leaders, and really anyone who’s concerned about health misinformation in their community. We have two key goals with this toolkit. The first is for people to learn about health misinformation. So the toolkit includes a basics about what it is, how it’s impacting us, and why it’s so tempting to share.
The second goal of the toolkit is for people to apply what they learn. And that’s why we’ve included interactive activities in the toolkit. There are graphics, checklists, five minute exercises, even a comic stream. And we’ve included tips on how to identify misinformation and how to talk with friends and family about health misinformation. This is something, in fact, which might come in handy around the dinner table, this Thanksgiving.
The toolkit is easy to read. It’s illustrated. It’s designed to give people the power to stop health misinformation in their communities. You can view the advisory and the toolkit at surgeongeneral.gov/healthmisinformation. I also want to be clear that our country still needs technology companies to step up and stop the avalanche of health misinformation on their platforms in order to fully turn the tide on health misinformation. We not only need individual action, but we also need these companies to move faster and more effectively than they’re currently moving.
In the end, we all have the power to protect people in our lives from health misinformation, whether we’re able to reach just a few trusted family members and friends, or a few million people through our platform. We can all make an impact on this urgent issue because the bottom line is that health misinformation takes away our freedom to make informed decisions about our health based on facts and science. The toolkit is one more step we are taking to help Americans recognize and stop health misinformation, and help people around them do the same. It will help us restore our freedom to make decisions about our health that are grounded in accurate science based information. Thanks so much for your time today and I’ll turn it back to you, Jeff.
Jeff: (15:19)
Well, thank you doctors. Let’s open it up for a few questions, Kevin?
Kevin: (15:24)
Thanks Jeff. First question. Let’s go to Kristin Walker at NBC.
Hi everyone. And thanks so much for doing the call. Really appreciate it. Two questions. One, can you bring us up to date on any data that you might have about breakthrough cases with kids? And number two, can you weigh in on or reflect upon the fact that you have some areas that are lifting mask mandates and others that are reinforcing them? For example, DC is set to lift its mask mandate indoors. Whereas, you have other areas in the region that are set to reimpose them. What should we make of that? And should there be a uniformity as it relates to masks?
Jeff: (16:10)
Okay. Dr. Walensky first question was is data on breakthrough with kids?
Yeah. Thank you for those questions, Kristen. So we are actively following vaccine effectiveness in our adolescence, our age 12 and up where we have more data and we’ll be updating our data on vaccine effectiveness later this week. In the meantime, we are actively also following data on our five to 11 year olds. Obviously, we need some time for them to get their first dose, get their second dose in order to follow those breakthrough cases. But we are actively collecting those data.
Jeff: (16:42)
Dr. Walensky, there is strong data that I think the CDC has cited about hospitalizations of kids that are vaccinated versus unvaccinated. Is it worth reviewing that data?
So certainly in terms of our adolescences?
Jeff: (16:58)
Yeah. So certainly the protection from our adolescence as we’ve seen is more than tenfold prevention of hospitalizations and deaths among our adolescent group. And we expect that same kind of extremely robust coverage for our younger group as well.
Jeff: (17:15)
Then the second question was the masking mandates.
Yeah, absolutely. So we currently still have over 85% of our counties in this country that are in substantial or high transmission. And the CDC guidance, first of all, obviously getting vaccinated, but the CDC guidance does recommend that jurisdictions be in the moderate or low transmission community transmission for several weeks before releasing mask requirements.
Jeff: (17:42)
Let’s go to Victoria Knight at Kaiser Health News.
Hey, thanks so much for taking my question. I actually have two. So first of all, I’ve talked to experts. I’ve looked at the data myself. I don’t see a lot of evidence to support getting a booster for young people, mostly in the 18 to 39 age range, because there’s just like not a lot of evidence that it protects from severe disease or hospitalization. It seems like that group of people is already pretty well protected. So can you talk about what evidence we might see if the FDA does decide to authorize a third booster for everyone? And then I’m also wondering if boosters do get authorized for everyone, are we moving to kind of a post fullback society? Are workplaces, are businesses going to need to ask for proof of boosters? Is that the direction that we’re moving in if we do decide that the whole public needs boosters?
Jeff: (18:45)
Good, Dr. Fauci, why don’t you start with the evidence around younger people and boosters?
Dr. Fauci: (18:50)
Yeah, I think we better be careful to not make too sharp a distinction between protecting against infection that’s symptomatic versus protection against hospitalization and deaths. Obviously, young individuals have a much less of a likelihood of progressing to severe disease than elderly individuals and adults. However, the children do get infected and they do get mild and sometimes moderate illness.
Dr. Fauci: (19:25)
So I don’t know of any other vaccine that we only worry about keeping people out of the hospital. I think an important thing is to prevent people from getting symptomatic disease. And I think there are plenty of children, adolescents and otherwise who clearly get infected, get symptomatic disease, and some even go on to long COVID. So there’s a really good reason to optimally protect younger individuals, in addition to obviously emphasizing the very strong importance of making sure that more vulnerable people, namely the elderly and those with underlying conditions, not only get their vaccination, but also get their booster. Back to you, Jeff.
Jeff: (20:12)
Dr. Walensky, the second question was really the definition of fully vaccinated.
Yeah. Great. Thank you. So the definition of fully vaccinated is two doses of a Moderna or a Pfizer vaccine as well as one dose of a J&J vaccine. Thank you.
Jeff: (20:27)
Miller, AP.
Miller: (20:33)
Thanks for doing this. As just from a 30,000 foot level as a number of public health officials across the country and around the world are saying that COVID-19 is becoming endemic, what is now the US government’s goal? What is the objective now in controlling the COVID 19 pandemic? Is it stamping it out entirely? Is it coming up with a way for the country to live with it going forward? And then secondly, given that the data that Dr. Fauci showed about the effectiveness of vaccines of preventing serious illness and death, how much longer will the public health protection measures that Dr. Walensky was talking about along lines of masking and the like, how long should those stay in place for when vaccines are now widely available for 90 odd percent of the country’s population?
Jeff: (21:25)
Dr. Fauci?
Dr. Fauci: (21:28)
Well, when you look at any kind of an outbreak, as I’ve said multiple times, but I’ll very briefly repeat it. You know, there’s the pandemic phase, the deceleration phase, control, elimination, and eradication. I don’t think we’re going to get eradication. We’ve only done that with smallpox. We’ve eliminated diseases by vaccination like polio in the United States, as it exists other places. We’ve eliminated measles in the United States. It exists other places. We’ve eliminated malaria years and years ago, but it exists in other places. So I don’t think we’re going to eliminate it completely.
Dr. Fauci: (22:02)
We want control. And I think the confusion is at what level of control are you going to accept it in its endemicity? And as far as we’re concerned, we don’t know really what that number is, but we will know it when we get there. It certainly is far, far lower than 80,000 new infections per day, and is far, far lower than 1,000 deaths per day, and tens of thousands of hospitalizations. So even though there’s a wide bracket under control, we want to get to the lowest possible level than we can get. And rather than picking an arbitrary number, why don’t we get as many people as we can get vaccinated, vaccinated as quickly as possible, and get as many people who are eligible for booster, getting boostered as possible. And when we get to that low level, we will know it rather than picking out an arbitrary number. Back to you, Jeff.
Jeff: (22:55)
Thank you. Thanks Dr. Fauci. Next question.
Kevin: (23:01)
Let’s go to Alex Alpert at Lawyers.
Jeff: (23:15)
Alex?
Jeff: (23:22)
[crosstalk 00:23:22].
Can you hear me?
Jeff: (23:23)
Now, we can. Yes. Thanks.
Thank you so much. I just wanted to ask the funds for the billion dollars in extra doses are only going to the mRNA vaccine makers. That loosefully includes Pfizer BioNTech and Moderna, or are there others? Why are you limiting it as such?
Jeff: (23:41)
Well, we do want to create the capability to have a billion doses of mRNA vaccine produced. That’s incremental capacity to what we have today, where the first application is likely to be used to produce more COVID-19 vaccines for the world. And then we have this ability for any future threat to produce mRNA vaccines, to counter that threat. The companies that you mentioned are companies that currently produce mRNA vaccine, but there are also other companies that are subcontractors of those companies. So we envision a wide array of companies responding to HHS’s request for information. Next question.
Kevin: (24:31)
Last question. Let’s go to Cheyenne Haslet at ABC News.
Hi. Thanks for taking my question. My first question is on boosters, J&J boosters are already recommended for all adults. Pfizer boosters for all adults are expected by a Friday to be authorized. Will the FDA consider Moderna in this upcoming authorization too? And my second question is on the confusing patchwork of booster recommendations in different states this past week. And I’m wondering why have the CDC and FDA not spoken out to give uniformed guidance as that has happened? Thank you.
Jeff: (25:08)
Dr. Walensky?
Yeah. Thank you for that question. Cheyenne, maybe I’ll just answer them all together and say we are actively following the science and the data. We know now that tens of millions of Americans are eligible for boosters, and we are encouraging everyone who is eligible and especially those who are most vulnerable, those over the age of 65, those with underlying medical conditions to go and get your booster right now. And as you noted, the FDA is actively reviewing data and we are in close touch with them. As soon as the FDA reviews those data and provides an authorization, we at CDC will act swiftly. We will be reviewing the epidemiologic data, the effectiveness data, as well as the safety data, and we will provide our recommendations as soon as we can.
Jeff: (25:57)
Well, thank you for today’s briefing. We look forward to the next one. Thank you everybody.
Thank you.


'''
print(review)



In [16]:
vs = analyzer.polarity_scores(review)
print(str(vs))

{'neg': 0.066, 'neu': 0.796, 'pos': 0.139, 'compound': 0.9999}


The compound scores are accurate more often than not, but accuracy on long texts like these is not great (around 65%). Software like Vader was designed for short texts and works better on them. We can still use it to understand some of the problems with deriving overall sentiment scores from a lexicon-based approach, and some of the challenges of measuring sentiment more generally.
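
One thing to keep in mind with long texts is that the compound score is a normalised sum of word valences, so it drifts toward -1 or +1 as more sentiment-bearing words accumulate, whatever the text is about. The sketch below assumes the normalisation used in `vader.py`, x / sqrt(x² + 15); the totals are made-up illustrative values, not taken from any review.

```python
import math


def normalize(valence_sum, alpha=15):
    """Squash a summed valence into the range -1 to +1 (formula assumed from vader.py)."""
    return valence_sum / math.sqrt(valence_sum ** 2 + alpha)


# Hypothetical valence totals: a short phrase, a paragraph, a long document.
for total in (2, 10, 50, 200):
    print(total, round(normalize(total), 4))
```

A text with only a modest surplus of positive words per paragraph can therefore end up with a compound score very close to 1, which is worth remembering when you interpret scores for whole documents.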

## Looking at sentiment scores for each sentence

Let’s look at an example review to think about the different frames of reference to which sentiments might be connected. The example we will use is a review of Neil Jordan’s film The Butcher Boy (filename cv079_11933.txt). 

A descriptive statement describes the content of the film. 

E.g. sentence 3: Francie is a “sick, needy child” - this tells us about what happens in the film.

An analytic statement analyses the content of the film. 

E.g. sentence 3: “I found it difficult to laugh at some of Francie’s darkly comic shenanigans” - here the reviewer is analysing the effects of the film.

It’s not a perfect distinction, but we can observe that negative content in the film doesn’t necessarily imply a negative review of the film. Both types of statement can include evaluative language and indications of the reviewer's point of view about the movie, but lexicon-based sentiment analysis will have difficulty if a review contains a lot of “negative” content and is nonetheless positive overall.
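
To make sentence-level scoring concrete, here is a minimal sketch that splits a text into sentences and collects the compound score for each. It is an illustration under assumptions: `split_sentences` and `score_sentences` are made-up helpers, the regular expression is a deliberately naive splitter, and the scorer is passed in as an argument so the sketch does not depend on Vader being installed.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    return [p for p in parts if p]


def score_sentences(text, scorer):
    """Return (sentence, compound) pairs.

    `scorer` is any function that maps a string to a dict with a 'compound' key,
    for example analyzer.polarity_scores.
    """
    return [(s, scorer(s)['compound']) for s in split_sentences(text)]


# Stand-in scorer so the sketch runs on its own; swap in analyzer.polarity_scores.
def toy_scorer(sentence):
    return {'compound': 1.0 if 'great' in sentence else 0.0}


for sentence, compound in score_sentences("The film is great. It is long!", toy_scorer):
    print(compound, sentence)
```

With the objects from the cells above, `score_sentences(review, analyzer.polarity_scores)` would return one pair per sentence. A regex splitter like this will mis-split abbreviations such as "Dr.", so a proper sentence tokenizer is a better choice for real work.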

**ACTIVITY:** Run the following cells to get scores for each sentence.

In [17]:
review = '''
Jeff: (00:00)
… pandemic and to prepare for any future threats. HHS is soliciting interest from companies that have experience manufacturing mRNA vaccines to identify opportunities to scale up their production capacity. Importantly, initial production could provide more mRNA COVID vaccines for the world. The goal of this program is to expand existing capacity by an additional billion doses per year with production starting by the second half of 2022.
Jeff: (00:37)
This program would also help us produce doses within six to nine months of identification of a future pathogen and ensure enough vaccines for all Americans. It would combine the expertise of the US government in basic scientific research with the robust ability of pharmaceutical companies to manufacture mRNA vaccines. We hope companies step up and act quickly to take us up on this opportunity to expand production of mRNA vaccines for the current pandemic and set us up to react quickly to any future pandemic threats.
Jeff: (01:15)
We know vaccinations are the best way to accelerate our path out of the pandemic. And taking a look at the data, we know the president’s plan is producing results. 80% of Americans, 12 and older with at least one shot, 10% of kids already with their first shot just 10 days into our program being at full strength, 250 million doses donated and delivered to the world. We know there’s more work to do, but these milestones represent critical progress and show we are on the right track in our fight against the virus. With that, let me turn it over to Dr. Walensky. Dr. Walensky… Oops. Can I be unmuted? Dr. Walensky is muted.
Start from the top, please. Thank you.
Thank you, Jeff. And good morning, everyone. As always, I’d like to start by walking you through the data. The current seven day daily average of cases is about 83,600 cases per day. The seven day average of hospital admissions is about 5,300 per day. And seven day average daily deaths are about 1,000 per day. Week by week, I present to you the current state of the pandemic on a national scale. These data reflect our entire country across all states, all age groups, and all people and are independent of underlying medical conditions or vaccination status.
At CDC, we look at data more granularly to understand important trends, separated out by different demographic and geographic subgroups. We do so to understand who may be at greater risk of COVID 19, who may be at risk for complications and to assess the waning vaccine effectiveness and what interventions we can put in place to help protect those who are most vulnerable. Studies show that those who are unvaccinated continue to be more likely to be infected, more likely to be in a hospital, and more likely to have severe complications from COVID-19.
In recent weeks, we have also seen additional data that reinforce the importance of COVID-19 boosters for these popular, at higher risk of severe disease, particularly to ensure protection against severe illness and hospitalizations. Those who live in long-term care facilities and adults over age 65 were among the first eligible for vaccination. And as vaccine coverage increased in these groups, we saw emergency department visits decline in both after early and robust vaccination efforts in January and February. We had powerful evidence that demonstrated that vaccines are effective and provide protection against the severe complications of COVID-19, especially in those at risk because of their age or underlying conditions.
Since then, we’ve been watching vaccine effectiveness in this population carefully. Although the highest risk are those people who are unvaccinated, we are seeing an increase in emergency department visits among adults age 65 and older, which are now again higher than they are for younger age groups.
Similarly, we also have new data that look at COVID-19 cases in long-term care facilities from our national healthcare safety network. When we compare rates of COVID-19 disease between those who are vaccinated with two doses, and those who have received a booster dose, the rate of disease is markedly lower for those who to their booster shot demonstrating our boosters are working. FDA is currently evaluating data on the authorization of booster doses for all people over age 18.
As we’ve done before, CDC will quickly review the safety and effectiveness data and make recommendations as soon as we hear from FDA. So we want to reinforce the importance of people who are eligible getting boosted now, especially those at highest risk for severe disease. This time of year, we typically see other respiratory viruses circulating like influenza. Last week’s influenza surveillance report noted an increase in flu activity that could mark the beginning of the influenza season.
We have been anticipating the return of flu viruses this season. If you’re wondering if you should get a flu vaccine, you should. It’ll protect you and your family against the flu. What’s the best gift to give this year? Consider the gift of health. It’s priceless. As we head into the holiday and winter season now is the time to think about protection for ourselves and our families. So many of us miss being with our friends and family last year. For those who are at higher risk of severe illness from COVID-19 and who are eligible for a COVID-19 booster dose, go out now and get your extra booster dose to protect you. And for those who are not yet vaccinated, including our children, teens, and adolescents who are now all eligible for vaccination, getting vaccinated this week will set you up to being fully protected in time for the holidays and by the end of the year. Thank you. I’ll now turn it over to Dr. Fauci.
Dr. Fauci: (06:50)
Thank you very much, Dr. Walensky. What I’d like to do in the next few minutes is underscore some of this scientific data on why vaccines protect namely what is the real world effectiveness of the COVID-19 vaccines. And the reason I say that is that as more and more people get vaccinated, no vaccine is a 100% effective. So we’re hearing reports of even vaccinated people getting infected, having some people to in fact question the effectiveness of vaccines. So let’s take a look at that to see if we can clarify that situation. Next slide.
Dr. Fauci: (07:29)
Let’s just go to different places. Let’s take Texas and look at a comparison of unvaccinated people with vaccinated people. Unvaccinated people were 13 times more likely to become infected than fully vaccinated and unvaccinated people were 20 times more likely to die than fully vaccinated people. And that is the state of Texas. Next slide.
Dr. Fauci: (07:55)
Let’s take a look now at the age range of that, because people may think that when you’re in a certain age range, you may not benefit from vaccines. Look at the left hand part of the slide is the different age groups. Look at the right hand part of the slide in red, which shows you the fold higher death rate in the unvaccinated compared to the vaccinated within that given age group that you see on the left. Next slide.
Dr. Fauci: (08:27)
Let’s go to Indiana and take a look. The blue is those who are fully vaccinated and the red are not fully vaccinated. And we’re comparing new hospital admissions, total admissions to the ICU, and total debts. It’s pretty clear to see the difference between the density of the red versus the density of the blue. Next slide.
Dr. Fauci: (08:56)
Let’s take a look at Virginia and take a look at among over 5 million Virginians who’ve been vaccinated. The hospitalization rate is extraordinarily low at 0.035%. And those who’ve died is 0.0125. Next slide. This isn’t only in the United States. Let’s take a look at New South Wales in Australia. Again, unvaccinated individuals are more than 16 times more likely to end up in the ICU or die during the peak period. So any concern about efficacy in the real world of vaccines, I hope we’ve put that to rest. Next slide.
Dr. Fauci: (09:41)
So what do we have? We have 62 million Americans eligible for vaccines who are still not vaccinated. The data that I show you do not lie. Vaccines protect you, your family, and your community. And importantly, it is not too late as Dr. Walensky has said. Get vaccinated now. And importantly, if you are already vaccinated six months or more ago and eligible for a boost, get a boost because as a matter of fact, the data that I just showed you for vaccinations in general hold true for boosters because the Israelis have shown that when you boost you multifold diminish the likelihood of getting infected, getting sick, or dying. And a recent paper from England, it isn’t only Israel shows for those who got the Pfizer vaccine months earlier, and the efficacy against symptomatic infection decreased to 62.5% with a boost, it went up to 94%. So vaccinations work and boosters optimize the vaccination. Having said that, let us all will enjoy the holidays. Now over to you, Dr. Murphy.
Well, thanks so much Dr. Fauci, and it’s good to be with everyone again today. I want to speak a little bit today about misinformation and our efforts about misinformation. Just a few months ago, my office released a Surgeon General’s advisory on health misinformation, which is false and accurate or misleading information according to the best evidence at the time. Now in this advisory, we declared that health misinformation is an urgent public health threat. Since then, I’ve been inspired by people all over the country who have stepped up to confront misinformation.
I’ve heard from doctors and nurses and pharmacists who have talked to their local schools about vaccines. I’ve heard from local elected officials who are making health misinformation a priority. Just a couple weeks ago, I joined hundreds of people of faith who gathered virtually to hear the truth about the COVID-19 vaccines and to share that with their communities.
So we are continuing to meet people where they are today. And today, we’re also happy to welcome the singer and philanthropist Ciara to the white house, where she will participate in a round table about getting vaccinated with First Lady Dr. Jill Biden. These are just a few of the many encouraging steps that we’ve seen over the last few months, but despite all of that health misinformation remains a threat today. With new milestones, like vaccines for kids ages 5 through 11, we’re seeing new waves of misinformation hitting the inboxes of parents and social media feeds across the country. This isn’t just something that affects a small segment of society. A recent poll indicated that nearly 80% of American adults either believe or aren’t sure about a common COVID [inaudible 00:12:41]. As clinicians, community leaders, and other partners tell me as well, they are continuing to hear people quoting these myths to explain why they won’t get vaccinated.
I’ve even heard some of these myths from my own family members who have received misleading videos and false articles through text chains and social media feeds. And I had to talk to my family members about why this content is harmful, but while it’s clear that stopping misinformation is an urgent task, we recognized it’s not always clear to people what they can do to help. And that’s why last week I released a community toolkit for addressing health misinformation. This toolkit is meant for everyone. From healthcare professionals, to teachers, to librarians, to faith leaders, and really anyone who’s concerned about health misinformation in their community. We have two key goals with this toolkit. The first is for people to learn about health misinformation. So the toolkit includes a basics about what it is, how it’s impacting us, and why it’s so tempting to share.
The second goal of the toolkit is for people to apply what they learn. And that’s why we’ve included interactive activities in the toolkit. There are graphics, checklists, five minute exercises, even a comic stream. And we’ve included tips on how to identify misinformation and how to talk with friends and family about health misinformation. This is something, in fact, which might come in handy around the dinner table, this Thanksgiving.
The toolkit is easy to read. It’s illustrated. It’s designed to give people the power to stop health misinformation in their communities. You can view the advisory and the toolkit at surgeongeneral.gov/healthmisinformation. I also want to be clear that our country still needs technology companies to step up and stop the avalanche of health misinformation on their platforms in order to fully turn the tide on health misinformation. We not only need individual action, but we also need these companies to move faster and more effectively than they’re currently moving.
In the end, we all have the power to protect people in our lives from health misinformation, whether we’re able to reach just a few trusted family members and friends, or a few million people through our platform. We can all make an impact on this urgent issue because the bottom line is that health misinformation takes away our freedom to make informed decisions about our health based on facts and science. The toolkit is one more step we are taking to help Americans recognize and stop health misinformation, and help people around them do the same. It will help us restore our freedom to make decisions about our health that are grounded in accurate science based information. Thanks so much for your time today and I’ll turn it back to you, Jeff.
Jeff: (15:19)
Well, thank you doctors. Let’s open it up for a few questions, Kevin?
Kevin: (15:24)
Thanks Jeff. First question. Let’s go to Kristin Walker at NBC.
Hi everyone. And thanks so much for doing the call. Really appreciate it. Two questions. One, can you bring us up to date on any data that you might have about breakthrough cases with kids? And number two, can you weigh in on or reflect upon the fact that you have some areas that are lifting mask mandates and others that are reinforcing them? For example, DC is set to lift its mask mandate indoors. Whereas, you have other areas in the region that are set to reimpose them. What should we make of that? And should there be a uniformity as it relates to masks?
Jeff: (16:10)
Okay. Dr. Walensky first question was is data on breakthrough with kids?
Yeah. Thank you for those questions, Kristen. So we are actively following vaccine effectiveness in our adolescence, our age 12 and up where we have more data and we’ll be updating our data on vaccine effectiveness later this week. In the meantime, we are actively also following data on our five to 11 year olds. Obviously, we need some time for them to get their first dose, get their second dose in order to follow those breakthrough cases. But we are actively collecting those data.
Jeff: (16:42)
Dr. Walensky, there is strong data that I think the CDC has cited about hospitalizations of kids that are vaccinated versus unvaccinated. Is it worth reviewing that data?
So certainly in terms of our adolescences?
Jeff: (16:58)
Yeah. So certainly the protection from our adolescence as we’ve seen is more than tenfold prevention of hospitalizations and deaths among our adolescent group. And we expect that same kind of extremely robust coverage for our younger group as well.
Jeff: (17:15)
Then the second question was the masking mandates.
Yeah, absolutely. So we currently still have over 85% of our counties in this country that are in substantial or high transmission. And the CDC guidance, first of all, obviously getting vaccinated, but the CDC guidance does recommend that jurisdictions be in the moderate or low transmission community transmission for several weeks before releasing mask requirements.
Jeff: (17:42)
Let’s go to Victoria Knight at Kaiser Health News.
Hey, thanks so much for taking my question. I actually have two. So first of all, I’ve talked to experts. I’ve looked at the data myself. I don’t see a lot of evidence to support getting a booster for young people, mostly in the 18 to 39 age range, because there’s just like not a lot of evidence that it protects from severe disease or hospitalization. It seems like that group of people is already pretty well protected. So can you talk about what evidence we might see if the FDA does decide to authorize a third booster for everyone? And then I’m also wondering if boosters do get authorized for everyone, are we moving to kind of a post fullback society? Are workplaces, are businesses going to need to ask for proof of boosters? Is that the direction that we’re moving in if we do decide that the whole public needs boosters?
Jeff: (18:45)
Good, Dr. Fauci, why don’t you start with the evidence around younger people and boosters?
Dr. Fauci: (18:50)
Yeah, I think we better be careful to not make too sharp a distinction between protecting against infection that’s symptomatic versus protection against hospitalization and deaths. Obviously, young individuals have a much less of a likelihood of progressing to severe disease than elderly individuals and adults. However, the children do get infected and they do get mild and sometimes moderate illness.
Dr. Fauci: (19:25)
So I don’t know of any other vaccine that we only worry about keeping people out of the hospital. I think an important thing is to prevent people from getting symptomatic disease. And I think there are plenty of children, adolescents and otherwise who clearly get infected, get symptomatic disease, and some even go on to long COVID. So there’s a really good reason to optimally protect younger individuals, in addition to obviously emphasizing the very strong importance of making sure that more vulnerable people, namely the elderly and those with underlying conditions, not only get their vaccination, but also get their booster. Back to you, Jeff.
Jeff: (20:12)
Dr. Walensky, the second question was really the definition of fully vaccinated.
Yeah. Great. Thank you. So the definition of fully vaccinated is two doses of a Moderna or a Pfizer vaccine as well as one dose of a J&J vaccine. Thank you.
Jeff: (20:27)
Miller, AP.
Miller: (20:33)
Thanks for doing this. As just from a 30,000 foot level as a number of public health officials across the country and around the world are saying that COVID-19 is becoming endemic, what is now the US government’s goal? What is the objective now in controlling the COVID 19 pandemic? Is it stamping it out entirely? Is it coming up with a way for the country to live with it going forward? And then secondly, given that the data that Dr. Fauci showed about the effectiveness of vaccines of preventing serious illness and death, how much longer will the public health protection measures that Dr. Walensky was talking about along lines of masking and the like, how long should those stay in place for when vaccines are now widely available for 90 odd percent of the country’s population?
Jeff: (21:25)
Dr. Fauci?
Dr. Fauci: (21:28)
Well, when you look at any kind of an outbreak, as I’ve said multiple times, but I’ll very briefly repeat it. You know, there’s the pandemic phase, the deceleration phase, control, elimination, and eradication. I don’t think we’re going to get eradication. We’ve only done that with smallpox. We’ve eliminated diseases by vaccination like polio in the United States, as it exists other places. We’ve eliminated measles in the United States. It exists other places. We’ve eliminated malaria years and years ago, but it exists in other places. So I don’t think we’re going to eliminate it completely.
Dr. Fauci: (22:02)
We want control. And I think the confusion is at what level of control are you going to accept it in its endemicity? And as far as we’re concerned, we don’t know really what that number is, but we will know it when we get there. It certainly is far, far lower than 80,000 new infections per day, and is far, far lower than 1,000 deaths per day, and tens of thousands of hospitalizations. So even though there’s a wide bracket under control, we want to get to the lowest possible level than we can get. And rather than picking an arbitrary number, why don’t we get as many people as we can get vaccinated, vaccinated as quickly as possible, and get as many people who are eligible for booster, getting boostered as possible. And when we get to that low level, we will know it rather than picking out an arbitrary number. Back to you, Jeff.
Jeff: (22:55)
Thank you. Thanks Dr. Fauci. Next question.
Kevin: (23:01)
Let’s go to Alex Alpert at Lawyers.
Jeff: (23:15)
Alex?
Jeff: (23:22)
[crosstalk 00:23:22].
Can you hear me?
Jeff: (23:23)
Now, we can. Yes. Thanks.
Thank you so much. I just wanted to ask the funds for the billion dollars in extra doses are only going to the mRNA vaccine makers. That loosefully includes Pfizer BioNTech and Moderna, or are there others? Why are you limiting it as such?
Jeff: (23:41)
Well, we do want to create the capability to have a billion doses of mRNA vaccine produced. That’s incremental capacity to what we have today, where the first application is likely to be used to produce more COVID-19 vaccines for the world. And then we have this ability for any future threat to produce mRNA vaccines, to counter that threat. The companies that you mentioned are companies that currently produce mRNA vaccine, but there are also other companies that are subcontractors of those companies. So we envision a wide array of companies responding to HHS’s request for information. Next question.
Kevin: (24:31)
Last question. Let’s go to Cheyenne Haslet at ABC News.
Hi. Thanks for taking my question. My first question is on boosters, J&J boosters are already recommended for all adults. Pfizer boosters for all adults are expected by a Friday to be authorized. Will the FDA consider Moderna in this upcoming authorization too? And my second question is on the confusing patchwork of booster recommendations in different states this past week. And I’m wondering why have the CDC and FDA not spoken out to give uniformed guidance as that has happened? Thank you.
Jeff: (25:08)
Dr. Walensky?
Yeah. Thank you for that question. Cheyenne, maybe I’ll just answer them all together and say we are actively following the science and the data. We know now that tens of millions of Americans are eligible for boosters, and we are encouraging everyone who is eligible and especially those who are most vulnerable, those over the age of 65, those with underlying medical conditions to go and get your booster right now. And as you noted, the FDA is actively reviewing data and we are in close touch with them. As soon as the FDA reviews those data and provides an authorization, we at CDC will act swiftly. We will be reviewing the epidemiologic data, the effectiveness data, as well as the safety data, and we will provide our recommendations as soon as we can.
Jeff: (25:57)
Well, thank you for today’s briefing. We look forward to the next one. Thank you everybody.
Thank you.


'''

# this splits the review by newlines and removes any empty strings
sentences = [sentence for sentence in review.splitlines() if sentence]

sentences

['Jeff: (00:00)',
 '… pandemic and to prepare for any future threats. HHS is soliciting interest from companies that have experience manufacturing mRNA vaccines to identify opportunities to scale up their production capacity. Importantly, initial production could provide more mRNA COVID vaccines for the world. The goal of this program is to expand existing capacity by an additional billion doses per year with production starting by the second half of 2022.',
 'Jeff: (00:37)',
 'This program would also help us produce doses within six to nine months of identification of a future pathogen and ensure enough vaccines for all Americans. It would combine the expertise of the US government in basic scientific research with the robust ability of pharmaceutical companies to manufacture mRNA vaccines. We hope companies step up and act quickly to take us up on this opportunity to expand production of mRNA vaccines for the current pandemic and set us up to react quickly to any future pandemic thre

In [23]:
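# score each sentence and collect the results as a list of dicts
# (DataFrame.append was removed in pandas 2.0, so the DataFrame is built in one step)
rows = []
for sentence in sentences:
    vs = analyzer.polarity_scores(sentence)
    vs['sentence'] = sentence
    rows.append(vs)
df = pd.DataFrame(rows, columns=['sentence','neg','neu','pos','compound'])
df[20:35]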

|  | sentence | neg | neu | pos | compound |
|---|---|---|---|---|---|
| 20 | Dr. Fauci: (08:27) | 0.0 | 1.0 | 0.0 | 0.0 |
| 21 | Let’s go to Indiana and take a look. The blue is those who are fully vaccinated and the red are not fully vaccinated. And we’re comparin... | 0.0 | 0.902 | 0.098 | 0.7239 |
| 22 | Dr. Fauci: (08:56) | 0.0 | 1.0 | 0.0 | 0.0 |
| 23 | Let’s take a look at Virginia and take a look at among over 5 million Virginians who’ve been vaccinated. The hospitalization rate is ext... | 0.092 | 0.853 | 0.055 | -0.5994 |
| 24 | Dr. Fauci: (09:41) | 0.0 | 1.0 | 0.0 | 0.0 |
| 25 | So what do we have? We have 62 million Americans eligible for vaccines who are still not vaccinated. The data that I show you do not lie... | 0.044 | 0.811 | 0.144 | 0.9487 |
| 26 | Well, thanks so much Dr. Fauci, and it’s good to be with everyone again today. I want to speak a little bit today about misinformation a... | 0.165 | 0.677 | 0.158 | 0.0258 |
| 27 | I’ve heard from doctors and nurses and pharmacists who have talked to their local schools about vaccines. I’ve heard from local elected ... | 0.035 | 0.854 | 0.111 | 0.6124 |
| 28 | So we are continuing to meet people where they are today. And today, we’re also happy to welcome the singer and philanthropist Ciara to ... | 0.059 | 0.854 | 0.087 | 0.4404 |
| 29 | I’ve even heard some of these myths from my own family members who have received misleading videos and false articles through text chain... | 0.108 | 0.797 | 0.095 | -0.0722 |


**ACTIVITY:** Look closely at each sentence and work out which ones relate to the reviewer's evaluation of the movie. 

**QUESTION:** Is Vader doing a good job of scoring these sentences?
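
To help answer this, it can be useful to turn each compound score into a label using the thresholds from the "About the Scoring" section of the Vader README (positive at 0.05 or above, negative at -0.05 or below, neutral in between). The sketch below is one possible way to do that; `label_compound` is a hypothetical helper name, not part of Vader, and the scores are copied from rows 20, 21, 23, 26 and 29 of the output above.

```python
def label_compound(compound, threshold=0.05):
    """Map a Vader compound score to a label using the README thresholds."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# compound scores copied from the dataframe output above
for score in [0.0, 0.7239, -0.5994, 0.0258, -0.0722]:
    print(score, label_compound(score))
```

If you have the dataframe from the previous cell, something like `df['label'] = df['compound'].apply(label_compound)` should add the labels as a new column, which makes it easier to compare Vader's classification with your own reading of each row.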

**ACTIVITY:** 
Try this with another review. You will need to replace the review text using one of the reviews from the movie reviews dataset you downloaded above and rerun the cells. Look carefully at the positively and negatively evaluated sentences using the compound score. 

**QUESTION:** 
From this analysis, what challenges do you see in correctly assigning overall sentiment scores to movie reviews?
Movie reviews often have mixtures of sentences which are positive and negative. Sentiment is also dependent on the genre and audience.

**ACTIVITY:** 
In class this week we discussed how sentiment analysis might not be an appropriate technique for analysing some kinds of texts. For example, some texts are not primarily about presenting a point of view or evaluation (e.g. journalistic texts, scientific writing) and authors/speakers don't always present their evaluations in a straightforward way (e.g. some political texts).  

Take some time to explore some different kinds of texts (e.g. editorials, fiction, tweets, news articles, political speeches, texts from the corpus you built for the Corpus Building Project). Vader will tend to perform better with short texts, so make sure you try texts of different lengths.
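
Since the earlier cell splits text by newlines, each "sentence" it scores is really a whole paragraph. If you want to try shorter units, the sketch below shows one rough way to break a paragraph into sentences with the standard library. `split_sentences` is a hypothetical helper, and the regular expression is a naive assumption: it splits after `.`, `!` or `?` followed by whitespace, so abbreviations such as "Dr. Fauci" will be split incorrectly.

```python
import re

def split_sentences(paragraph):
    """Naive sentence splitter: break after ., ! or ? when followed by whitespace."""
    parts = re.split(r'(?<=[.!?])\s+', paragraph.strip())
    return [part for part in parts if part]

# a few sentences taken from the transcript above
paragraph = ("Let’s go to Indiana and take a look. "
             "The blue is those who are fully vaccinated and the red are not fully vaccinated. "
             "Next slide.")

for sentence in split_sentences(paragraph):
    print(sentence)
```

Each piece can then be passed to `analyzer.polarity_scores` in the same way as in the cells above, so you can compare sentence-level scores with the score for the whole paragraph.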

**QUESTION:** 
How does Vader perform on different kinds of texts? What kinds of texts are challenging for a lexicon-based approach to sentiment analysis? What kinds of texts are not appropriate for sentiment analysis?


Informational texts are not appropriate.

Political texts fluctuate between positive and negative, and politicians often answer questions ambiguously.


**This is it for the labs for DIGI405! Before you go today – make sure you thank your tutors for all their help and support during the course!**