In this example notebook, we will see how to perform sentiment analysis on multiple Wikipedia articles in a single job. First, we need to import the analysis functions from wiki_sentiment_multi. We will demonstrate two of them: analyze_list and analyze_csv.

In [1]:
from wiki_sentiment_multi import analyze_list, analyze_csv

First we will look at analyze_list. This function analyzes multiple URLs given as strings in a Python list. It is useful if you just want to do a quick analysis without any additional setup. Take a look at the inputs specified below. For our example, let's analyze the articles for three famous singers.

In [2]:
urls=['https://en.wikipedia.org/wiki/Taylor_Swift',
'https://en.wikipedia.org/wiki/Justin_Bieber',
'https://en.wikipedia.org/wiki/Rihanna'] #List of URLs formatted as strings
models=['bert','roberta','robertuito'] #Should be a string or list of strings. Available choices are 'bert','roberta','robertuito', and 'distilbert'.
output_filename='singers_analysis.csv' #Filename for output csv file. Default is 'analysis.csv'
output_path='path/to/my/file' #Filepath at which csv file is saved. If left unspecified, file will be saved in the current working directory
show_progress=True #Whether or not to display job progress. True by default

Now we're ready to analyze! If show_progress is True, you will see job progress in terms of articles completed.

In [3]:
analyze_list(urls,models,output_filename)

1/3 articles analyzed
2/3 articles analyzed
3/3 articles analyzed


Job complete! The recorded results are exactly the same as those recorded in the single-URL case.

Next we will look at analyze_csv. This option is useful if you have a larger number of articles to analyze or an existing table with custom metrics. The analyze_csv function generates the same results as before and appends them to your existing table. Its inputs are mostly the same as those of analyze_list, but you must specify the name of your starting .csv file. You can also optionally specify the input path of your .csv file and the name of the column in your .csv file that contains the article URLs. For this next example, let's perform sentiment analysis for some famous musicians, starting from a .csv file containing only their names and Wikipedia article URLs.
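If you do not yet have a starting file, the sketch below shows one way to create it with Python's standard csv module. The rows and the 'Name' column header are assumptions made for illustration; only the 'URL' column name matches the default described in this notebook. The helper write_url_table is hypothetical and is not part of wiki_sentiment_multi.

```python
import csv

# Hypothetical contents for the starting file: one row per article.
# The 'Name' header is an assumption; 'URL' is the default url_col.
rows = [
    {"Name": "John Lennon", "URL": "https://en.wikipedia.org/wiki/John_Lennon"},
    {"Name": "Paul McCartney", "URL": "https://en.wikipedia.org/wiki/Paul_McCartney"},
    {"Name": "George Harrison", "URL": "https://en.wikipedia.org/wiki/George_Harrison"},
    {"Name": "Ringo Starr", "URL": "https://en.wikipedia.org/wiki/Ringo_Starr"},
]


def write_url_table(filename, rows):
    """Write the rows to a UTF-8 encoded csv file with a header line."""
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["Name", "URL"])
        writer.writeheader()
        writer.writerows(rows)


write_url_table("beatles.csv", rows)
```

Any other columns already present in your own table can stay where they are, since only the URL column is needed to locate the articles.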

In [4]:
input_filename='beatles.csv' #Filename of a UTF-8 encoded csv file that contains a column of URLs
models=['bert','roberta','robertuito'] #Should be a string or list of strings. Available choices are 'bert','roberta','robertuito', and 'distilbert'.
output_filename='beatles_analysis.csv' #Filename for output csv file. Default is 'analysis.csv'
input_path='path/to/my/file' #Input filepath of csv file. If left unspecified, it is the current working directory
output_path='path/to/my/file' #Filepath at which csv file is saved. If left unspecified, file will be saved in the current working directory
show_progress=True #Whether or not to display job progress. True by default
url_col='URL' #Name of the column in input csv file containing the URLs. 'URL' by default

Time to analyze!

In [5]:
analyze_csv(input_filename,models,output_filename)

1/4 articles analyzed
2/4 articles analyzed
3/4 articles analyzed
4/4 articles analyzed
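
Because analyze_csv appends its results to your existing table, a quick way to see what was added is to compare the header of the input file with the header of the output file. The sketch below is a minimal, hypothetical helper (added_columns is not part of wiki_sentiment_multi) that uses only the standard csv module. It makes no assumption about what the result columns are called.

```python
import csv


def added_columns(input_csv, output_csv):
    """Return the column names found in output_csv but not in input_csv."""

    def header(path):
        # Read only the first row of the file, which holds the column names.
        with open(path, newline="", encoding="utf-8") as f:
            return next(csv.reader(f), [])

    original = set(header(input_csv))
    # Preserve the order in which the new columns appear in the output file.
    return [col for col in header(output_csv) if col not in original]


# Example usage once the job above has finished (file names from this notebook):
# print(added_columns("beatles.csv", "beatles_analysis.csv"))
```

The usage line is left commented out because it depends on both files being present in the working directory.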


This example has demonstrated two ways to run sentiment analysis on multiple Wikipedia articles in a single job. With these tools, you're ready to perform your own study!