Before you start, make sure you have all the required packages installed:

* To create a word cloud: `wordcloud`
* To import an image: `pillow` (imported later as `PIL`)
* To scrape text from Wikipedia: `wikipedia`. This one is optional; you can instead load or create your own text data rather than pulling it from the web.

Remember, packages must be installed before they can be imported:

> OPTION 1: In Anaconda Prompt, try: `conda install -c conda-forge wikipedia` <br>
> OPTION 2: Directly from Jupyter Notebook or JupyterLab, try: `pip install wikipedia` <br>
> OPTION 3: In Anaconda Navigator, searching for packages such as `wordcloud` or `wikipedia` may come up empty at first. Click the 'Channels' button; a small window pops up where you can add more repositories. Add `conda-forge`, which contains many more packages. After that, I was able to find `wordcloud` and `wikipedia` and install them from Navigator without typing any `conda` or `pip` commands.
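
Whichever option you choose, it can help to confirm that the packages are visible to the Python interpreter running the notebook. The snippet below is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` mapping are illustrative names, not part of any of the packages above. Note that `pillow` is installed under one name but imported as `PIL`.

```python
import importlib.util

# pip package name -> import name (pillow is imported as PIL).
# This mapping is an illustrative assumption based on the list above.
REQUIRED = {"wordcloud": "wordcloud", "pillow": "PIL", "wikipedia": "wikipedia"}

def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [pip_name for pip_name, import_name in required.items()
            if importlib.util.find_spec(import_name) is None]

# Prints a list of packages that still need installing, or "none"
print("Missing packages:", missing_packages() or "none")
```

If the list is not empty, install the named packages with one of the options above and run the check again.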

In [None]:
import sys

In [None]:
!{sys.executable} -m pip install wikipedia

In [None]:
!{sys.executable} -m pip install wordcloud

In [None]:
# Import packages
import wikipedia
import re
import numpy as np
import matplotlib.pyplot as plt

from wordcloud import WordCloud, STOPWORDS
from PIL import Image

# Load data

In [None]:
# Specify the title of the Wikipedia page
wiki = wikipedia.page('Sydney')
# Extract the plain text content of the page
text = wiki.content
print(text)

In [None]:
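# Clean text: remove '== Section ==' heading markers, then replace newlines
# with spaces so that words on adjacent lines are not glued together
text = re.sub(r'==.*?==+', '', text)
text = text.replace('\n', ' ')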
print(text)
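
To see what the cleaning step does without downloading a page, here is a small self-contained sketch on a made-up sample string. The function name `clean_wiki_text` and the sample text are illustrative assumptions; the heading pattern is the same one used above, while collapsing every run of whitespace into a single space is a choice made for this sketch.

```python
import re

def clean_wiki_text(raw):
    """Drop '== Heading ==' markers and collapse whitespace to single spaces."""
    no_headings = re.sub(r'==.*?==+', '', raw)
    return re.sub(r'\s+', ' ', no_headings).strip()

# Made-up sample imitating the layout of wikipedia page content
sample = ("Sydney is a city.\n\n\n== History ==\nIt was founded in 1788."
          "\n\n=== Early years ===\nMore text.")

print(clean_wiki_text(sample))
# → Sydney is a city. It was founded in 1788. More text.
```

The lazy `.*?` keeps each match within a single heading, and the trailing `==+` lets the same pattern cover deeper levels such as `=== Early years ===`.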

# Word Cloud

In [None]:
# Define a function to plot word cloud
def plot_cloud(wordcloud):
    plt.figure(figsize=(40, 30)) # Set figure size
    plt.imshow(wordcloud)        # Display image
    plt.axis("off")              # No axis details

In [None]:
# Generate word cloud
wordcloud = WordCloud(width = 3000, height = 2000, random_state=1, \
                      background_color='salmon', colormap='Pastel1', \
                      collocations=False, stopwords = STOPWORDS).generate(text)

# Note: the trailing '\' above continues a long line; inside parentheses it is optional.
# Colors: http://www.science.smith.edu/dftwiki/index.php/Color_Charts_for_TKinter
# Colormaps: https://matplotlib.org/3.2.1/tutorials/colors/colormaps.html

# Plot
plot_cloud(wordcloud)
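
Larger words in the cloud are the ones that occur more often in the text once stopwords are removed. The sketch below is a rough, standard-library approximation of that counting, useful for previewing which words are likely to dominate. It is not how `WordCloud` is implemented: the tokenization is simplified, and `DEMO_STOPWORDS`, `top_words` and the demo sentence are made up for illustration (the notebook itself uses `STOPWORDS` from `wordcloud`).

```python
import re
from collections import Counter

# Hypothetical mini stopword list for illustration only
DEMO_STOPWORDS = {"the", "is", "a", "of", "in", "and"}

def top_words(text, stopwords=DEMO_STOPWORDS, n=3):
    """Return the n most frequent lowercase words that are not stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(n)

demo = ("Sydney is the capital of New South Wales. "
        "Sydney Harbour is in Sydney, and the harbour is busy.")

print(top_words(demo, n=2))
# → [('sydney', 3), ('harbour', 2)]
```

Running `top_words(text, stopwords=STOPWORDS, n=20)` on the cleaned page text gives a quick preview of the words that are likely to appear largest.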

In [None]:
wordcloud = WordCloud(width = 3000, height = 2000, random_state=1, \
                      background_color='salmon', colormap='Set2', \
                      collocations=False, stopwords = STOPWORDS).generate(text)
plot_cloud(wordcloud)

In [None]:
wordcloud = WordCloud(width = 1000, height = 600, random_state=1, \
                      background_color='black', colormap='Dark2', \
                      collocations=False, stopwords = STOPWORDS).generate(text)
plot_cloud(wordcloud)

# Fancier word cloud

In [None]:
# Import image to np.array; WordCloud leaves pure white areas of the mask empty
mask = np.array(Image.open('upvote.png'))

In [None]:
# With a mask, the cloud takes the shape of the mask, so width and height are ignored
wordcloud = WordCloud(width = 3000, height = 2000, random_state=1, \
                      background_color='white', colormap='Dark2', \
                      collocations=False, stopwords = STOPWORDS, mask=mask).generate(text)
plot_cloud(wordcloud)

In [None]:
wordcloud = WordCloud(width = 3000, height = 2000, random_state=1, \
                      background_color='black', colormap='Dark2', \
                      collocations=False, stopwords = STOPWORDS, mask=mask).generate(text)
plot_cloud(wordcloud)
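
If you do not have an image such as `upvote.png` at hand, a mask can also be built directly with NumPy. The following is a minimal sketch of a circular mask; the function name `circle_mask` and the default size are illustrative assumptions. It relies on the `WordCloud` convention that pure white (255) entries are masked out and all other entries are free to draw on.

```python
import numpy as np

def circle_mask(size=400):
    """Return a size x size uint8 array: 0 inside a circle, 255 (white) outside."""
    y, x = np.ogrid[:size, :size]      # row and column coordinate grids
    center = (size - 1) / 2
    outside = (x - center) ** 2 + (y - center) ** 2 > (size / 2) ** 2
    return (255 * outside).astype(np.uint8)

mask_demo = circle_mask(100)
print(mask_demo.shape, mask_demo[0, 0], mask_demo[50, 50])
# → (100, 100) 255 0
```

Passing `mask=circle_mask()` to `WordCloud` in place of the image-based mask should confine the words to the circle.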