# Working with Publicly Available Pre-Trained Models

In this notebook we will focus on the use of publicly available pre-trained models.

More specifically, we will use the **Diffusers library** and the **Transformers library** by [Hugging Face](https://huggingface.co/). These libraries give access to thousands of pre-trained models, many of which were trained on huge datasets for thousands of GPU hours. You can use them either directly for inference (as we will do in this lab session) or fine-tune them for your specific applications.

**Pre-trained models versus building models from scratch:** Using pre-trained models allows you to reduce your compute costs and carbon footprint and save time and resources required to develop a model from scratch.

**Pre-trained models versus APIs:** Compared to APIs, these libraries are a bit more difficult to use, but they give you more control.

The notebook also shows that libraries and APIs can be flexibly combined and used hand in hand.

# Working with Stable Diffusion for Image Generation

This section shows how to use Stable Diffusion with the 🤗 Hugging Face [🧨 Diffusers library](https://github.com/huggingface/diffusers). The library can be used to create images from textual prompts with just a few lines of code.

This section of the notebook has been adapted from [here](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb).

### Setting Up the Stable Diffusion Pipeline

Make sure you are using a GPU runtime to run this notebook, so inference is much faster. If the following command fails, use the `Runtime` menu above and select `Change runtime type`.

In [None]:
!nvidia-smi

In [None]:
device = "cuda"
#Note that you need to replace cuda with cpu should you not get a GPU allocated

Next, you should install `diffusers` as well as `scipy`, `ftfy` and `transformers`. `accelerate` is used to achieve much faster loading.

Moreover, we install some additional packages that we need in later parts of this notebook: `transformers[sentencepiece]`, `openai` and `google-cloud-vision`.

In [1]:
!pip install --upgrade git+https://github.com/huggingface/diffusers.git transformers transformers[sentencepiece] scipy ftfy accelerate openai google-cloud-vision
#Note that this step can take a while


`StableDiffusionPipeline` is an end-to-end inference pipeline that you can use to generate images from text with just a few lines of code. To use the `StableDiffusionPipeline` we have to first import it along with `torch`.

In [None]:
from diffusers import StableDiffusionPipeline
import torch

Next, we load the pre-trained weights of all components of the model. In this notebook we use Stable Diffusion version 1.4 ([CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4)), but there are other variants that you may want to try:
* [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
* [stabilityai/stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base)
* [stabilityai/stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1). This version can produce images with a resolution of 768x768, while the others work at 512x512.

In addition to the model id (i.e., [CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4)), we pass `torch_dtype` to the `from_pretrained` method.

To ensure that Stable Diffusion runs on a free Google Colab instance, we load the weights in half precision by passing `torch_dtype=torch.float16`, which tells `diffusers` to use float16 weights. The model repository also provides a dedicated half-precision branch, [`fp16`](https://huggingface.co/CompVis/stable-diffusion-v1-4/tree/fp16).

If you want the highest possible precision, remove `torch_dtype=torch.float16`, at the cost of higher memory usage.
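The choice of device and precision can also be made programmatically instead of editing the cells by hand. The following is a minimal sketch, not part of the original notebook: the helper name `pick_device_and_dtype` is hypothetical, and it assumes that half precision is only wanted when a CUDA GPU is available.

```python
import importlib.util


def pick_device_and_dtype():
    """Return ("cuda", "float16") when PyTorch reports a GPU, else ("cpu", "float32")."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed at all, so fall back to the CPU settings.
        return "cpu", "float32"
    import torch

    if torch.cuda.is_available():
        return "cuda", "float16"
    return "cpu", "float32"


device_name, dtype_name = pick_device_and_dtype()
print(device_name, dtype_name)
```

In the notebook, `device_name` could take the place of the hard-coded `device` string, and `getattr(torch, dtype_name)` would give the value to pass as `torch_dtype`.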

In [None]:
model_id = "CompVis/stable-diffusion-v1-4"

pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)

#Note that you need to replace float16 with float32 should you not get a GPU allocated and device="cpu"

Next, we move the pipeline to GPU (device="cuda") to have faster inference.




In [None]:
pipe = pipe.to(device)

## Generating Images with Stable Diffusion

### Image Generation with Stable Diffusion

In [None]:
# Function to generate an image (PIL format, see https://pillow.readthedocs.io/en/stable/) and save it to disk
def generate_image(prompt):
  image = pipe(prompt).images[0]
  image.save("astronaut_rides_horse.png")
  return image

In [None]:
prompt = "a photograph of an astronaut riding a horse"

image = generate_image(prompt)

image

Running the above cell multiple times will give you a different image every time. For **reproducibility**, use a `Generator` and set a manual seed. Every time you use the same seed, you will get the same image.
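The idea behind a seeded generator can be illustrated without loading the model. In the sketch below, Python's standard `random` module serves as a stand-in for `torch.Generator` (an assumption made purely for illustration), and `sample_noise` is a hypothetical helper: two generators created with the same seed produce the same pseudo-random sequence, which is what makes the starting noise, and therefore the image, repeatable.

```python
import random


def sample_noise(seed, n=4):
    """Draw n pseudo-random numbers from a generator seeded with `seed`."""
    # Stand-in for torch.Generator(device).manual_seed(seed)
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


same_a = sample_noise(1024)
same_b = sample_noise(1024)
other = sample_noise(7)

print(same_a == same_b)  # same seed, same sequence
print(same_a == other)   # different seed, different sequence
```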

In [None]:
generator = torch.Generator(device).manual_seed(1024)

image = pipe(prompt, generator=generator).images[0]

image

Getting the `DiffusionPipeline` to generate images in a certain style or include what you want can be tricky. Typically, it is necessary to run the `DiffusionPipeline` several times before ending up with an image you are satisfied with. Thus, it is important to reduce the time between inference cycles so you can iterate faster by getting the best speed and memory efficiency from the pipeline.

For some tips and tricks on effective and efficient Stable Diffusion, visit [here](https://huggingface.co/docs/diffusers/stable_diffusion).

### Efficient Image Generation

By default, the `DiffusionPipeline` runs inference with full `float32` precision for 50 inference steps. We have already switched to the lower `float16` precision.

Another way to speed up generation is to reduce the number of inference steps with the `num_inference_steps` argument. In general, results improve with more steps.

However, a more efficient scheduler can reduce the number of steps needed without sacrificing output quality.

The Stable Diffusion model uses the `PNDMScheduler` by default, which usually requires ~50 inference steps. Other schedulers (e.g., `EulerDiscreteScheduler`) get by with fewer inference steps. You might need to experiment a bit to find the optimal range of steps for each scheduler (for more details see [here](https://www.youtube.com/watch?v=N5ZAMa3BUxc)).
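When experimenting with step counts, it helps to measure how long each setting takes. The following is a minimal timing sketch using only the standard library; `time_call` and `fake_pipeline` are hypothetical names, and `fake_pipeline` merely simulates work per step so that the example runs without a GPU.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_pipeline(prompt, num_inference_steps=50):
    """Hypothetical stand-in for `pipe`: does a little work per step."""
    total = 0
    for _ in range(num_inference_steps):
        total += sum(range(1000))
    return total


timings = {}
for steps in (10, 30, 50):
    _, seconds = time_call(fake_pipeline, "an astronaut", num_inference_steps=steps)
    timings[steps] = seconds

print(timings)
```

With the real pipeline, the call would look like `time_call(pipe, prompt, num_inference_steps=steps, generator=generator)`.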


To get the list of schedulers compatible with the used model use `pipe.scheduler.compatibles`.

In [None]:
pipe.scheduler.compatibles


To swap out the noise scheduler, pass it to `from_pretrained`.

In [None]:
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

model_id = "CompVis/stable-diffusion-v1-4"

#Use the Euler scheduler here instead
scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16)
pipe = pipe.to(device)

In [None]:
generator = torch.Generator(device).manual_seed(1024)
image = pipe(prompt, num_inference_steps=30, generator=generator).images[0]
display(image)

To generate multiple images for the same prompt with different numbers of steps, we simply call the pipeline multiple times with different step counts and store the resulting images in a list.

To show the generated images in a grid we use a helper function called `image_grid` which creates a grid of images.
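The placement arithmetic behind such a grid helper can be sketched on its own: image `i` goes into column `i % cols` and row `i // cols`. The function name `paste_positions` is hypothetical and the 512x512 tile size is just an example value.

```python
def paste_positions(n_images, cols, width, height):
    """Return the top-left (x, y) pixel offset of each image in a row-major grid."""
    return [((i % cols) * width, (i // cols) * height) for i in range(n_images)]


positions = paste_positions(6, cols=3, width=512, height=512)
print(positions)
```

For six 512x512 images in three columns, the first row starts at x = 0, 512 and 1024 with y = 0, and the second row repeats those x offsets with y = 512.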

In [None]:
#Helper function to create a grid of images
from PIL import Image

def image_grid(imgs, rows, cols):
    assert len(imgs) == rows*cols

    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols*w, rows*h))
    grid_w, grid_h = grid.size

    for i, img in enumerate(imgs):
        grid.paste(img, box=(i%cols*w, i//cols*h))
    return grid

In [None]:
import torch

#Initially empty list images
images = []

#This loop creates images using 10 to 60 steps; the created images are added to list images
for steps in range(10,61,10):
  generator = torch.Generator(device).manual_seed(1024)
  image = pipe(prompt, num_inference_steps=steps, generator=generator).images[0]
  images.append(image)

#displays the created list of images as grid
image_grid(images, 2, 3)

## Comparing Stable Diffusion and OpenAI Images API

In the next part of the notebook we compare Stable Diffusion with OpenAI's Images API.
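The OpenAI client picks up its key from the `OPENAI_API_KEY` environment variable. A small check like the following can give a clearer error when a variable is missing. This is only a sketch: `require_env` is a hypothetical helper, and the demonstration uses the made-up variable `DEMO_API_KEY` so that no real key is involved.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set.")
    return value


# Demonstration with a hypothetical variable name and a placeholder value.
os.environ["DEMO_API_KEY"] = "not-a-real-key"
print(require_env("DEMO_API_KEY"))
```

Calling `require_env("OPENAI_API_KEY")` before creating the client would fail early if the key had not been set.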

In [None]:
import os
from openai import OpenAI

# The OpenAI client reads the key from the OPENAI_API_KEY environment variable
os.environ['OPENAI_API_KEY'] = 'COPY THE API_KEY HERE'

client = OpenAI()


In [None]:
# function to generate image using OpenAI Images API
def generate_OpenAI_image(prompt):
  response = client.images.generate(
    model="dall-e-3",
    prompt=prompt,
    size="1024x1024",
    quality="standard",
    n=1,
  )

  return response.data[0].url

In [None]:
# Function to generate an image (PIL format, see https://pillow.readthedocs.io/en/stable/) using Stable Diffusion
def generate_SD_image(prompt):
  generator = torch.Generator(device).manual_seed(1024)
  image = pipe(prompt, num_inference_steps=30, generator=generator).images[0]
  return image


In [None]:
prompt = "a photograph of an astronaut riding a horse"
imageSD = generate_SD_image(prompt)
imageOpenAI_URL = generate_OpenAI_image(prompt)

In [None]:
from PIL import Image
import urllib.request

# To show the generated images in a grid we add them to initially empty list images
images = []

# Add image generated via Stable Diffusion
images.append(imageSD)

# Add image generated via OpenAI
with urllib.request.urlopen(imageOpenAI_URL) as url:
    img = Image.open(url)
    # Resize to the Stable Diffusion image's size so both images fit the grid cells
    smaller_img = img.resize(imageSD.size)
    images.append(smaller_img)

# Show all images from images[] in a grid
image_grid(images, 1, 2)

To learn more about Stable Diffusion versus OpenAI Images API visit [here](https://zapier.com/blog/stable-diffusion-vs-dalle/).

# Working with the Transformers Library for Different Natural Language Processing Tasks


In this section of the notebook we will use the **Transformers library** by [Hugging Face](https://huggingface.co/). The library gives access to thousands of pre-trained models, many of which were trained on huge datasets for thousands of GPU hours. You can use them either directly for inference (as we will do in this lab session) or fine-tune them for your specific applications.

The Hugging Face [ModelHub](https://huggingface.co/models) hosts pre-trained models for many different tasks, which can be downloaded and used easily with the Transformers library.

##  Transformers for Translation Tasks

The easiest way to use a pre-trained model for inference is the **pipeline**. The pipeline can be used out of the box for many tasks across modalities (e.g., text and images). In this lab session we will look at a subset, including translation and text classification.

<table>
  <tr>
    <th>Task</th>
    <th>Description</th>
    <th>Pipeline identifier</th>
  </tr>
  <tr>
    <td>Translation</td>
    <td>translate text from one language into another</td>
    <td>pipeline(task="translation")</td>
  </tr>
  <tr>
    <td>Text classification</td>
    <td>assign a label to a given sequence of text</td>
    <td>pipeline(task="sentiment-analysis")</td>
  </tr>
  <tr>
    <td>Text generation</td>
    <td>generate text that follows a given prompt</td>
    <td>pipeline(task="text-generation")</td>
  </tr>
  <tr>
    <td>Image Classification</td>
    <td>assign a label to an image</td>
    <td>pipeline(task="image-classification")</td>
  </tr>
</table>

For a comprehensive overview you can click [here](https://huggingface.co/docs/transformers/main/en/quicktour#pipeline).


As a first example, we explore the pipeline for translating text from one language into another, using a task name of the form `pipeline("translation_xx_to_yy")`. For example, to translate from English to German you can use `pipeline("translation_en_to_de")`.

The pipeline downloads and caches a default pre-trained model. You can then use it on your target text, e.g., `translated_text = translator(input_text)`.  
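A translation pipeline returns a list with one dictionary per input, and the translated string sits under the key `translation_text`. The sketch below shows how to pull the plain strings out of such a result. The sample value is a hand-written placeholder with the same shape, not actual model output, and `extract_translations` is a hypothetical helper.

```python
# Hand-written placeholder shaped like a translation pipeline result;
# the German text is not actual model output.
sample_result = [{"translation_text": "Hallo Welt"}]


def extract_translations(results):
    """Pull the translated strings out of a list of result dictionaries."""
    return [item["translation_text"] for item in results]


print(extract_translations(sample_result))
```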

In [None]:
from transformers import pipeline
translator = pipeline("translation_en_to_de")

input_text = "Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models. Using pretrained models can reduce your compute costs, carbon footprint, and save you the time and resources required to train a model from scratch. "
translated_text = translator(input_text)
print(translated_text)


Rather than using the default model, we now use a **specific model** for the translation task by passing its name to the pipeline via the `model` parameter.

For translation tasks alone, more than 2800 pre-trained models that work with the Transformers library can be found in the ModelHub (for an overview visit [here](https://huggingface.co/models?pipeline_tag=translation&library=transformers&sort=trending)).

We will use `mdl_name = "Helsinki-NLP/opus-mt-en-de"`. You can find details concerning the model on its [model card](https://huggingface.co/Helsinki-NLP/opus-mt-en-de) in the ModelHub.

In [None]:
mdl_name = "Helsinki-NLP/opus-mt-en-de"
translator = pipeline("translation", model=mdl_name)

input_text = "Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models. Using pretrained models can reduce your compute costs, carbon footprint, and save you the time and resources required to train a model from scratch. "
translated_text = translator(input_text)
print(translated_text)

### Example: Combining Text Detection with Translation

We will now combine a translation task with code for text detection via an existing API, which you already know from the previous session.

More specifically, we will use the Google Cloud Vision API to detect text in an image and then use the Transformers library to translate the detected text.
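The combination of the two steps can be sketched independently of the actual services. In the example below, `detect_then_translate` is a hypothetical helper that receives the detection and translation steps as functions, and `fake_detect` and `fake_translate` are stand-ins so that the sketch runs without credentials or model downloads. It also skips the translation when no text was found, which is an assumption about the desired behaviour.

```python
def detect_then_translate(uri, detect, translate):
    """Run `detect` on the image URI and translate the text only if any was found."""
    text = detect(uri).strip()
    if not text:
        return ""
    return translate(text)


def fake_detect(uri):
    """Stand-in for the Vision API call: pretends only .jpg images contain text."""
    return "Alles Gute zum Geburtstag" if uri.endswith(".jpg") else ""


def fake_translate(text):
    """Stand-in for the translation pipeline: tags the text instead of translating it."""
    return f"[en] {text}"


print(detect_then_translate("card.jpg", fake_detect, fake_translate))
print(repr(detect_then_translate("blank.png", fake_detect, fake_translate)))
```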

In [None]:
# Import the libraries
from google.cloud import vision
import os
import json

In [None]:
credentials = {
##COPY the content of the JSON file here - remember you find it in Canvas##

}

json_credentials = json.dumps(credentials)

with open('My Project-543e6ed386ee.json','w') as outfile:
  outfile.write(json_credentials)

In [None]:
# Using the GOOGLE_APPLICATION_CREDENTIALS environment variable the location of a credential JSON file can be provided.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'My Project-543e6ed386ee.json'

In [None]:
# Instantiate the client (this only works with the credentials correctly set)
client = vision.ImageAnnotatorClient()

In [None]:
# Here we use a publicly-accessible URL as image URI
# Before making the request we open the image via its uri and display it
from PIL import Image
import urllib.request

uri = 'https://www.inside-digital.de/img/whatsapp-geburtstagssprueche2.jpg?class=1200x900'
#uri = 'https://www.galaxus.ch/im/Files/2/8/7/1/1/2/6/5/959002-H-002.xxl3.jpgexportGa4PCo68TlLe9g?impolicy=ProductTileImage&resizeWidth=648&resizeHeight=486&cropWidth=648&cropHeight=486&resizeType=downsize&quality=high'

with urllib.request.urlopen(uri) as url:
    img=Image.open(url)
    display(img)

In [None]:
# Function to detect text in image
def detectTextInImage(uri):

  # Set image to be analyzed by Google Vision
  image = vision.Image()
  image.source.image_uri=uri

  response_text = client.text_detection(image=image)

  text=""
  # the if statement checks if text could be detected
  if response_text.text_annotations:
    text = response_text.text_annotations[0].description

  return text

In [None]:
from transformers import pipeline

#Translate the detected text to English
def translateText(prompt):
  mdl_name = "Helsinki-NLP/opus-mt-de-en"
  translator = pipeline("translation", model=mdl_name)
  response = translator(prompt)
  return response

In [None]:
#Function to detect text in an image and translate the detected text
def detect_and_translate_text(uri):

  detectedText = detectTextInImage(uri)

  print(detectedText)

  response = translateText(detectedText)

  return response

In [None]:
translatedText = detect_and_translate_text(uri)

print(translatedText)

##  Transformers for Text Classification

In the next step we use the Transformers library for a text classification task (using `pipeline("sentiment-analysis")`). We use a specific model (i.e., `mdl_name = "siebert/sentiment-roberta-large-english"`), which is a fine-tuned checkpoint of a RoBERTa large model.

The ModelHub again offers multiple models suitable for classification tasks. For an overview visit [here](https://huggingface.co/models?pipeline_tag=text-classification&library=transformers&sort=trending).
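A function that builds the pipeline on every call has to load the model each time it is invoked. One way to avoid this is to cache the loaded object. The sketch below illustrates the idea with `functools.lru_cache`; `slow_load` is a hypothetical stand-in for `pipeline("sentiment-analysis", model=model_name)`, and `"demo-model"` is a made-up name.

```python
from functools import lru_cache

load_calls = []


def slow_load(model_name):
    """Hypothetical stand-in for pipeline("sentiment-analysis", model=model_name)."""
    load_calls.append(model_name)
    return lambda text: {"model": model_name, "text": text}


@lru_cache(maxsize=None)
def get_classifier(model_name):
    """Load the classifier for `model_name` once and reuse it afterwards."""
    return slow_load(model_name)


first = get_classifier("demo-model")
second = get_classifier("demo-model")
print(first is second, len(load_calls))
```

Both calls return the same object, and the loader runs only once.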

In [None]:
from transformers import pipeline

def classifyText(prompt):

  mdl_name = "siebert/sentiment-roberta-large-english"
  sentiment_pipeline = pipeline("sentiment-analysis", model=mdl_name)

  return sentiment_pipeline(prompt)

In [None]:
prompt = "I am super happy"
print(prompt + ": " + str(classifyText(prompt)))

prompt = "The weather is awful today"
print(prompt + ": " + str(classifyText(prompt)))

This pipeline is very flexible. You can pass a list of prompts at once and get one output per prompt.

In [None]:
prompts = ["I am super happy", "The weather is awful today"]
sentiments = classifyText(prompts)

# Pair each prompt with its result
for prompt, result in zip(prompts, sentiments):
  print(prompt + ": " + str(result))

### Example: Sentiment Analysis of Twitter Data Using Python

This section is inspired by the following [Kaggle post](https://www.kaggle.com/code/kabirnagpal/vaccine-tweet-analysis-with-hugging-face).

#### Data Loading and Data Preprocessing

In [29]:
#Load the dataset (https://www.kaggle.com/datasets/gpreda/all-covid19-vaccines-tweets)
!wget https://raw.githubusercontent.com/HSG-AIML-Teaching/EMBA2024-Lab/main/lab_07/vaccination_tweets.csv

--2024-01-22 19:32:53--  https://raw.githubusercontent.com/HSG-AIML-Teaching/IEMBA2024-Lab/main/lab_07/vaccination_tweets.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4502265 (4.3M) [text/plain]
Saving to: ‘vaccination_tweets.csv’


2024-01-22 19:32:54 (57.3 MB/s) - ‘vaccination_tweets.csv’ saved [4502265/4502265]



In [30]:
# import required libraries
import transformers
import pandas as pd
import re

In [31]:
# read data into DataFrame
df = pd.read_csv("vaccination_tweets.csv").head(100)

# drop columns which are not relevant
df.drop(["id","user_name", "user_description", "user_created", "user_followers", "user_friends", "user_favourites", "user_verified", "favorites"],axis=1,inplace=True)

# print first 5 rows
df.head()

Unnamed: 0,user_location,date,text,hashtags,source,retweets,is_retweet
0,"La Crescenta-Montrose, CA",2020-12-20 06:06:44,Same folks said daikon paste could treat a cyt...,['PfizerBioNTech'],Twitter for Android,0,False
1,"San Francisco, CA",2020-12-13 16:27:13,While the world has been on the wrong side of ...,,Twitter Web App,1,False
2,Your Bed,2020-12-12 20:33:45,#coronavirus #SputnikV #AstraZeneca #PfizerBio...,"['coronavirus', 'SputnikV', 'AstraZeneca', 'Pf...",Twitter for Android,0,False
3,"Vancouver, BC - Canada",2020-12-12 20:23:59,"Facts are immutable, Senator, even when you're...",,Twitter Web App,446,False
4,,2020-12-12 20:17:19,Explain to me again why we need a vaccine @Bor...,"['whereareallthesickpeople', 'PfizerBioNTech']",Twitter for iPhone,0,False


As we are focusing on the tweets, we extract the column "text". Let's print the first 5 tweets in the dataset.

In [None]:
tweets = df['text'].values
tweets[:5]

Before beginning with our task, let's first preprocess the data to remove URLs and emojis.
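The cleaning steps can be tried on a single made-up tweet first. The following compact sketch uses a hypothetical function `clean_tweet` and an invented sample string; unlike the notebook's own function, it also strips leading and trailing spaces.

```python
import re


def clean_tweet(tweet):
    """Drop non-ASCII characters, links and punctuation from one tweet."""
    # Encoding to ASCII with errors ignored removes emojis and other non-ASCII symbols
    ascii_only = tweet.encode("ascii", "ignore").decode()
    # Drop every space-separated token that starts with "http"
    words = [w for w in ascii_only.split(" ") if not w.startswith("http")]
    # Replace runs of non-alphanumeric characters with a single space
    no_punct = re.sub(r"[^0-9a-zA-Z]+", " ", " ".join(words))
    return re.sub(r" +", " ", no_punct).strip()


sample = "Vaccines work! \U0001F489 https://t.co/abc #PfizerBioNTech"
print(clean_tweet(sample))
```

For this sample, the emoji, the link, the exclamation mark and the hash sign are removed, leaving `Vaccines work PfizerBioNTech`.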

In [None]:
def data_preprocess(words):

    # remove emojis and other non-ASCII characters
    words = words.encode('ascii','ignore')
    words = words.decode()

    # split the string into words
    words = words.split(' ')

    # remove URLs
    words = [word for word in words if not word.startswith('http')]
    words = ' '.join(words)

    # replace punctuation and other non-alphanumeric characters with spaces
    words = re.sub(r"[^0-9a-zA-Z]+", " ", words)

    # collapse repeated spaces
    words = re.sub(' +', ' ', words)
    return words

In [None]:
# We apply the preprocessing function to every tweet and display the first 5 tweets again
tweets = [data_preprocess(tweet) for tweet in tweets]
tweets[:5]

#### Using Pipelines for Sentiment Analysis

Pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most of the complex code from the library, offering a simple API dedicated to several tasks.

The pipelines download the pre-trained models once; the models can then be reused whenever required.

In [None]:
# create pipeline for sentiment analysis using the default model
sentiment = pipeline('sentiment-analysis')

# You can explore with alternative models to see which one works best for this particular context

## Link to ModelCard: https://huggingface.co/finiteautomata/bertweet-base-sentiment-analysis
# sentiment = pipeline('sentiment-analysis', "finiteautomata/bertweet-base-sentiment-analysis")

## Link to ModelCard: https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment
# !pip install emoji==0.6.0
# sentiment = pipeline('sentiment-analysis', "cardiffnlp/twitter-roberta-base-sentiment")

Sentiment for tweet at index 1:

In [None]:
print(tweets[1])
print(sentiment(tweets[1]))

Sentiment for tweet at index 4:

In [None]:
print(tweets[4])
print(sentiment(tweets[4]))

We can easily use the same API for batches of data as given below. This might take some time.

In [None]:
sentiment(tweets[:5])

In [None]:
# sentiments are stored in variable tweet_sentiment_data and put into a pandas DataFrame
tweet_sentiment_data = sentiment(tweets)
tweet_sentiment_data = pd.DataFrame(tweet_sentiment_data)
# Here we create another DataFrame which shows the text of the tweet along the sentiment data
tweetsWithSentiments = pd.concat([df["text"], tweet_sentiment_data],axis=1,join="outer")
tweetsWithSentiments.head()

In [None]:
# we count the different sentiments
tweetsWithSentiments['label'].value_counts()

Hence we observe that the dataset has more **Negative** tweets than **Positive** ones.

Because the model was trained on large, standardised data, we can trust its predictions to a great extent. However, if you want to adapt it to your own data, you can fine-tune the model (see [here](https://medium.com/@lokaregns/fine-tuning-transformers-with-custom-dataset-classification-task-f261579ae068)).

NOTE: The score here refers to the probability of the label.
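How such a probability can arise from a classifier's raw outputs is sketched below, assuming the score is obtained by applying a softmax to the model's logits (a common choice for single-label classification, stated here as an assumption). The logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    # Subtracting the maximum keeps the exponentials numerically stable
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["NEGATIVE", "POSITIVE"]
logits = [-1.2, 2.3]  # made-up raw scores, one per label
probs = softmax(logits)

# The reported label is the one with the highest probability
best = max(range(len(probs)), key=probs.__getitem__)
print(labels[best], round(probs[best], 3))
```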

In a next step we create a histogram showing the distribution of labels.

In [None]:
import plotly.express as px

In [None]:
px.histogram(tweetsWithSentiments, x="label",nbins=100,opacity=.5,title="Tweets per Category")

#### Summarization of Tweets using Chat Completions API

In this section we combine the sentiment analysis with OpenAI's Chat Completions API to provide a summary of the tweets.
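Instead of fixing the number of tweets, the amount of text sent to the API can also be capped by length. The following is a minimal sketch under that assumption; `join_within_budget` is a hypothetical helper and the character budget is an arbitrary example value.

```python
def join_within_budget(texts, max_chars, sep=" "):
    """Join texts in order, stopping before the result would exceed max_chars."""
    parts, length = [], 0
    for text in texts:
        # Every text after the first also costs one separator
        extra = len(text) + (len(sep) if parts else 0)
        if length + extra > max_chars:
            break
        parts.append(text)
        length += extra
    return sep.join(parts)


demo = ["first tweet", "second tweet", "third tweet"]
print(join_within_budget(demo, max_chars=25))
```

With a budget of 25 characters only the first two demo texts fit, so the third one is left out.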

In [None]:
#Here we combine the first 25 tweets
joinedTweets = ' '.join(tweets[:25])

In [None]:
from openai import OpenAI

In [None]:
os.environ['OPENAI_API_KEY'] = 'COPY THE API_KEY HERE'
client = OpenAI()

In [None]:
# Chat Completions API used for text summarization
def summarizeTweets(joinedTweets):
  response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
      {"role": "system", "content": "Given a list of tweets, create a summary of at most 200 words."},
      {"role": "user", "content": joinedTweets},
    ]
  )
  return response.choices[0].message.content

In [None]:
summarizeTweets(joinedTweets)

#### Bonus: Generate Word Cloud from Tweets

In this subsection we create a Word Cloud from Tweets using Python.

This section is inspired by the following [Kaggle post](https://www.kaggle.com/code/jeongbinpark/pfizer-vaccine-tweets-analysis-with-wordcloud).
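A word cloud is driven by word frequencies, so the core of the computation is counting words while ignoring stop words. The sketch below shows that step with the standard library only. The function `word_frequencies`, the tiny stop-word set and the demo sentences are all made up for illustration.

```python
from collections import Counter

# Tiny illustrative stop-word set; a real list would be much longer
DEMO_STOP_WORDS = {"the", "is", "a", "of", "and", "to"}


def word_frequencies(texts, stop_words):
    """Count lower-cased words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        for word in text.lower().split():
            if word not in stop_words:
                counts[word] += 1
    return counts


demo_texts = ["The vaccine is safe", "A dose of the vaccine"]
freqs = word_frequencies(demo_texts, DEMO_STOP_WORDS)
print(freqs.most_common(1))
```

A mapping of this kind (word to count) is what `generate_from_frequencies` receives in the word cloud cell of this notebook.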

In [None]:
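# This piece of code takes the content of the Twitter messages
text_list = df["text"].to_list()

# Keep the part of each tweet before its first link; join with spaces so words of adjacent tweets do not merge
text = " ".join(tweet.split("https:")[0] for tweet in text_list)

# Replace spaces and unwanted characters with commas, then collapse repeated commas
text = text.replace(" ", ",")
text = re.sub(r"[@#\n.…?'\d)(%*]", ",", text)
text = re.sub(",{2,}", ",", text)

text[:1000]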

In [None]:
# Split into words
text = text.split(',')
text[:10]

In [None]:
# Remove stop words

import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')

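def stop_w(words):
    # Look up the stop words once and use a set for fast membership tests
    stop_words = set(stopwords.words("english"))
    new_s = []
    for word in words:
        # Skip empty strings left over from splitting as well as stop words
        if word and word.lower() not in stop_words:
            new_s.append(word.lower())
    return new_s

text = stop_w(text)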

In [None]:
# create word cloud

from wordcloud import WordCloud
import matplotlib.pyplot as plt

text_count = pd.Series(text).value_counts()
wc = WordCloud(width=1000, height=600, background_color="white", random_state=0)
plt.figure(figsize=(20,10),facecolor='w')
plt.imshow(wc.generate_from_frequencies(text_count))
plt.axis("off")
plt.show()