# Sentiment Analysis of the Public's Opinion on DNA Fingerprinting using the Twitter API

**Goals**
- To analyze the public's opinion on DNA fingerprinting using publicly available tweets.

**Useful Links**
- [Flowchart Overview](https://lucid.app/lucidchart/647ae773-5d75-4902-b463-f4ebaefc6a0e/edit?viewport_loc=-95%2C-163%2C2524%2C1568%2C0_0&invitationId=inv_c9da80f4-0c98-494f-98fa-f0a84c45c683)

## Install Dependencies
This notebook uses the `tweepy`, `gspread`, `colab-env`, `demoji`, and `textblob` Python libraries.

References
- [tweepy](https://www.tweepy.org/): Used to query the Twitter API. 
- [gspread](https://docs.gspread.org/en/latest/): API for Google Spreadsheets.
- [colab-env](https://pypi.org/project/colab-env/): Used for environment variables in Google Colab notebooks. 
- [demoji](https://pypi.org/project/demoji/): Removes and replaces emojis in text strings. 

In [1]:
!pip install tweepy
!pip install --upgrade gspread
!pip install colab-env --upgrade
!pip install demoji
!pip install textblob

Collecting gspread
  Downloading gspread-4.0.1-py3-none-any.whl (29 kB)
Installing collected packages: gspread
  Attempting uninstall: gspread
    Found existing installation: gspread 3.0.1
    Uninstalling gspread-3.0.1:
      Successfully uninstalled gspread-3.0.1
Successfully installed gspread-4.0.1
Collecting colab-env
  Downloading colab-env-0.2.0.tar.gz (4.7 kB)
Collecting python-dotenv<1.0,>=0.10.0
  Downloading python_dotenv-0.19.1-py2.py3-none-any.whl (17 kB)
Building wheels for collected packages: colab-env
  Building wheel for colab-env (setup.py) ... done
  Created wheel for colab-env: filename=colab_env-0.2.0-py3-none-any.whl size=3836 sha256=bfd6c33a7f78a802d9e8196cee3ee78ed6326d656781975859d8a8b5b41c454d
  Stored in directory: /root/.cache/pip/wheels/bb/ca/e8/3d25b6abb4ac719ecb9e837bb75f2a9b980430005fb12a9107
Successfully built colab-env
Installing collected packages: python-dotenv, colab-env
Successfully installed colab-env-0.2.0 python-dotenv-0.19.1

## Importing Dependencies
*Also downloading information needed for `nltk` and `demoji`*

In [16]:
'''import dependencies'''
import math 
import tweepy
import pandas as pd
import colab_env
import gspread
from oauth2client.client import GoogleCredentials
from google.colab import auth
import os
import demoji
import re
import nltk
from textblob import TextBlob
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download([
     "names",
     "stopwords",
     "state_union",
     "twitter_samples",
     "movie_reviews",
     "averaged_perceptron_tagger",
     "vader_lexicon",
     "punkt",
     "brown"])

demoji.download_codes()

[nltk_data] Downloading package names to /root/nltk_data...
[nltk_data]   Package names is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package state_union to /root/nltk_data...
[nltk_data]   Package state_union is already up-to-date!
[nltk_data] Downloading package twitter_samples to /root/nltk_data...
[nltk_data]   Package twitter_samples is already up-to-date!
[nltk_data] Downloading package movie_reviews to /root/nltk_data...
[nltk_data]   Package movie_reviews is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...



## Authorize Google Spreadsheets

In [3]:
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

## Setting Environment Variables
*Used to set environment variables for Twitter API keys.*

[Google's Reference for Setting Environment Variables](https://colab.research.google.com/github/apolitical/colab-env/blob/master/colab_env_testbed.ipynb#scrollTo=PgQHBQfJDMMb)

To get your API credentials, apply for a developer account [here](https://developer.twitter.com/en/apply-for-access).

In [4]:
colab_env.envvar_handler.add_env("API_KEY", "<YOUR API KEY>", overwrite=True)
colab_env.envvar_handler.add_env("API_SECRET_KEY", "<YOUR API SECRET KEY>", overwrite=True)
colab_env.envvar_handler.add_env("BEARER_TOKEN", "<YOUR BEARER TOKEN>", overwrite=True)
colab_env.envvar_handler.add_env("ACCESS_TOKEN", "<YOUR ACCESS TOKEN>", overwrite=True)
colab_env.envvar_handler.add_env("ACCESS_TOKEN_SECRET", "<YOUR ACCESS TOKEN SECRET>", overwrite=True)
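
Before querying the API, it can help to confirm that every variable name is readable exactly as the later cells expect it. The sketch below is an illustration only: `missing_env_vars` is a hypothetical helper that is not part of `colab-env`, and the example uses a stand-in dictionary rather than real credentials.

```python
import os

# Variable names that the api_config cell of this notebook reads.
REQUIRED_VARS = [
    "API_KEY",
    "API_SECRET_KEY",
    "BEARER_TOKEN",
    "ACCESS_TOKEN",
    "ACCESS_TOKEN_SECRET",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in `environ` (defaults to os.environ)."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Stand-in mapping: a key stored with a trailing space is not found under the clean name.
fake_env = {"API_KEY": "abc", "BEARER_TOKEN ": "xyz"}
print(missing_env_vars(["API_KEY", "BEARER_TOKEN"], fake_env))  # prints ['BEARER_TOKEN']
```

Calling `missing_env_vars(REQUIRED_VARS)` after the cell above should return an empty list when all five values are set.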

## Query the Twitter API


### Overview
1. Read keywords from the `keywords` sheet in the [[DATA] Public Sentiment on DNA Fingerprinting](https://docs.google.com/spreadsheets/d/1ern1BJY6qs84ZkLWYGsw02n2UL5tv_uyavh7VIKrDrg/edit?usp=sharing) workbook. 
2. Query the Twitter API for 100 English tweets for each keyword in the spreadsheet before a certain date. The date is changed every day to query for different tweets. 
3. Populate each raw tweet into its corresponding date sheet.

### Documentation

#### Usage of `class Searcher`
```python
API_Searcher = Searcher(<workbook_name>, <sheet_name>, <cell_range>, <api_config>)
API_Searcher.search_api(until=<YYYY-MM-DD>)
API_Searcher.populate_tweets(<result_sheet_name>)
```
###### `__init__()` arguments
- `workbook_name` - string. name of the Google Spreadsheet file that you are working on. Ex: `"[DATA] Public Sentiment on DNA Fingerprinting"`
- `sheet_name` - string. name of the sheet within `workbook_name` that contains the keywords to search for. Ex: `"keywords"`
- `cell_range` - string. cell range where the keywords are located. Reading stops at the first blank cell, so any keywords below it are ignored. Ex: `"C1:C1000"`
- `api_config` - dictionary. Twitter API credentials stored under the keys `apiKey`, `apiSecretKey`, `accessToken`, and `accessTokenSecret`.
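
The way a cell range is turned into a keyword list can be sketched without Google Sheets. The function below is a simplified stand-in for `_read_cells` that works on plain strings instead of gspread cell objects; the name `read_until_blank` is hypothetical.

```python
def read_until_blank(values):
    """Collect values in order and stop at the first empty one."""
    keywords = []
    for value in values:
        if not value:
            break  # everything after the first blank cell is ignored
        keywords.append(value)
    return keywords


cells = ["dna profiling", "dna typing", "", "genetic profile"]
print(read_until_blank(cells))  # prints ['dna profiling', 'dna typing']
```

Because reading stops at the first gap, keywords should be kept in one contiguous block within the range.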

###### `API_Searcher.search_api()` arguments
- `until` - string. Returns tweets created before the given date. Date should be formatted as YYYY-MM-DD. Keep in mind that the search index has a 7-day limit. In other words, no tweets will be found for a date older than one week. Ex: `"2021-07-31"`. [Tweepy Docs](https://docs.tweepy.org/en/stable/api.html#search-tweets)
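
Since only about a week of tweets is searchable, it may be convenient to generate the candidate `until` dates instead of typing them by hand. This is a minimal sketch: `recent_until_dates` and `sheet_name_for` are hypothetical helpers, and the sheet naming simply mirrors the `date[5:] + "_raw"` pattern used in the query cell of this notebook.

```python
from datetime import date, timedelta


def recent_until_dates(today, days=7):
    """Return YYYY-MM-DD strings for `today` and the previous `days - 1` days, newest first."""
    return [(today - timedelta(days=offset)).isoformat() for offset in range(days)]


def sheet_name_for(until):
    """Drop the year from a YYYY-MM-DD string and append `_raw`."""
    return until[5:] + "_raw"


for until in recent_until_dates(date(2021, 11, 3), days=3):
    print(until, "->", sheet_name_for(until))
# prints:
# 2021-11-03 -> 11-03_raw
# 2021-11-02 -> 11-02_raw
# 2021-11-01 -> 11-01_raw
```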

###### `API_Searcher.populate_tweets()` arguments
- `sheet_name` - string. Name of the sheet in `workbook` to populate tweets found from the Twitter API into. 
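
The layout that ends up in the result sheet is one column per keyword, with the keyword as the header and its tweets below. The notebook builds this with pandas; the sketch below shows the same shape using only the standard library, with the hypothetical helper `to_sheet_rows` and empty strings as padding (an assumption made for illustration).

```python
from itertools import zip_longest


def to_sheet_rows(word_to_tweets):
    """Return a header row of keywords followed by one row per tweet index, padded with ""."""
    header = list(word_to_tweets.keys())
    columns = [word_to_tweets[key] for key in header]
    rows = [list(row) for row in zip_longest(*columns, fillvalue="")]
    return [header] + rows


example = {"dna typing": ["tweet a", "tweet b"], "dna profile": ["tweet c"]}
for row in to_sheet_rows(example):
    print(row)
# prints:
# ['dna typing', 'dna profile']
# ['tweet a', 'tweet c']
# ['tweet b', '']
```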

### References
- [Twitter API Docs](https://developer.twitter.com/en/docs)
- [Tweepy API Docs](https://docs.tweepy.org/en/stable/api.html)
- [GSpread Docs](https://docs.gspread.org/en/latest/user-guide.html#creating-a-worksheet)

In [5]:
'''Read keywords from the keywords sheet in the [DATA] Public Sentiment on DNA Fingerprinting workbook.'''

class Searcher():
  def __init__(self, workbook, sheet, range, api_config):
    self.workbook = workbook
    self.sheet = sheet
    self.range = range
    self._api_config = api_config
    self.api = self._create_api()
    self.keywords = self._read_keywords()
    self.word_to_tweets = {}
    
  def _create_api(self): 
    auth = tweepy.OAuthHandler(self._api_config['apiKey'], self._api_config['apiSecretKey'])
    auth.set_access_token(self._api_config['accessToken'], self._api_config['accessTokenSecret'])
    api = tweepy.API(auth, wait_on_rate_limit=True)
    return api
      
  def _read_keywords(self):
    # function reads a cell range in google sheets and returns a python list
    workbook = gc.open(self.workbook) # workbooks
    # save sheets
    worksheet = workbook.worksheet(self.sheet)
    # read data from sheets
    cell_list = worksheet.range(self.range)

    keywords = self._read_cells(cell_list)
    return keywords 

  def _read_cells(self, cell_list):
    # reads the cells in the cell_list
    lst = []
    for cell in cell_list:
      if cell.value:
        lst.append(cell.value) # cells with values are added to list
      else:
        break # empty cells are not read
    return lst

  def _convert_tweets(self, word, tweets):
    self.word_to_tweets[word] = []

    for tweet in tweets:
      if 'retweeted_status' in tweet._json:
        text = tweet._json['retweeted_status']['full_text']
      else:
        text = tweet.full_text

      if text in self.word_to_tweets[word]:
        # self.word_to_tweets[word].append("")
        pass
      else:
        self.word_to_tweets[word].append(text)


  def search_api(self, until):
    for word in self.keywords:
      # Returns a list of tweets; with tweet_mode='extended' the text is in tweet.full_text.
      # Note: Tweepy 4.x renamed API.search to API.search_tweets.
      tweets = self.api.search(word, count=100, tweet_mode='extended', lang='en', until=until)
      self._convert_tweets(word, tweets) # add tweets to dictionary mapping keyword to list of tweets
      print("done: " + word + "...")
    
  def populate_tweets(self, sheet_name):
    workbook = gc.open(self.workbook) # entire workbook
    try:
      worksheet_to_delete = workbook.worksheet(sheet_name)
      workbook.del_worksheet(worksheet_to_delete)
    except gspread.exceptions.WorksheetNotFound:
      # worksheet does not exist yet, so there is nothing to delete
      pass
    worksheet = workbook.add_worksheet(title=sheet_name, rows="1000", cols=str(len(self.keywords)))
    dataframe = pd.DataFrame.from_dict(self.word_to_tweets, orient='index')
    dataframe = dataframe.transpose()
    worksheet.update([dataframe.columns.values.tolist()] + dataframe.values.tolist())


In [6]:
api_config = {
    "apiKey": os.getenv("API_KEY"),
    "apiSecretKey": os.getenv("API_SECRET_KEY"),
    "bearerToken": os.getenv("BEARER_TOKEN"),
    "accessToken": os.getenv("ACCESS_TOKEN"),
    "accessTokenSecret": os.getenv("ACCESS_TOKEN_SECRET"),
}

In [7]:
dates = ['2021-11-03'] # change this line to query for different dates
dates_to_sheet_name = {
    date : date[5:] + "_raw" for date in dates
}

for date in dates:
  API_Searcher = Searcher('[DATA] Public Sentiment on DNA Fingerprinting', 'keywords', 'B4:B1000', api_config)
  API_Searcher.search_api(until=date)
  API_Searcher.populate_tweets(dates_to_sheet_name[date])
  print("--------------stats for", date, "--------------")
  for key in API_Searcher.word_to_tweets:
    print(len(API_Searcher.word_to_tweets[key]), "tweets for", key)
  print("")

done: dna fingerprinting...
done: dna fingerprint...
done: genetic fingerprinting...
done: genetic fingerprint...
done: dna identification...
done: dna profiling...
done: dna profile...
done: dna typing...
done: genetic profile...
done: genetic profiling...
--------------stats for 2021-11-03 --------------
37 tweets for dna fingerprinting
49 tweets for dna fingerprint
4 tweets for genetic fingerprinting
7 tweets for genetic fingerprint
67 tweets for dna identification
28 tweets for dna profiling
45 tweets for dna profile
13 tweets for dna typing
46 tweets for genetic profile
12 tweets for genetic profiling



## Data Merging & Cleaning
Querying the Twitter API creates many sheets, each holding the tweets pulled for one time span. To perform sentiment analysis on the entire dataset, all tweets first need to be merged into a common sheet and cleaned.

### Overview of Data Merging
1. Go through each sheet that ends in `_raw`, and collect all the tweets in each column into a dictionary. Duplicate tweets are ignored (since the different queries through time spans may have overlap). 
2. Populate the combined data into a new sheet called `merged_tweets`.
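
The merge step can be sketched on plain dictionaries, with no spreadsheet involved. Here each raw sheet is represented as a `{keyword: [tweets]}` dictionary, which is a simplification of the row records the notebook actually reads; `merge_keyword_tweets` is a hypothetical name.

```python
def merge_keyword_tweets(sheets):
    """Merge a list of {keyword: [tweets]} dicts, keeping only the first copy of each tweet."""
    merged = {}
    for sheet in sheets:
        for keyword, tweets in sheet.items():
            bucket = merged.setdefault(keyword, [])
            for tweet in tweets:
                # skip blank cells and tweets already collected for this keyword
                if tweet and tweet not in bucket:
                    bucket.append(tweet)
    return merged


day_one = {"dna typing": ["tweet a", "tweet b"]}
day_two = {"dna typing": ["tweet b", "tweet c"], "dna profile": ["tweet d", ""]}
print(merge_keyword_tweets([day_one, day_two]))
# prints {'dna typing': ['tweet a', 'tweet b', 'tweet c'], 'dna profile': ['tweet d']}
```

The list lookup keeps the original order of tweets, at the cost of a linear scan per tweet; for a few hundred tweets per keyword that is unlikely to matter.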

### Overview of Data Cleaning
After data merging, each tweet in the `merged_tweets` sheet is cleaned. Cleaning removes stopwords (filler words such as "these" and "of") and converts other parts of typical tweet text into tokens that sentiment analyzers can handle.
1. All image URLs (`https://t.co/...` links) are replaced with `_IMAGE`
2. All regular URLs are replaced with `_URL`
3. All emojis are replaced with their textual descriptions (`🙂` → `:slightly smiling face:`)
4. All runs of three or more repeated characters are shortened to two (`heeeeeeeello` → `heello`)
5. All English stopwords from `nltk` are removed

Example Tweet:
- Before Cleaning: 
```We 💚love💚 these photos of some very impressive students learning gel electrophoresis and DNA profiling ... in first year! 🤯 Thank you for sharing the photos @GoreyEtss. We're looking forward to seeing what these scientists do next! #BiotechExperience @ABEProgOffice https://t.co/idow3wAkSd```
- After Cleaning:
```: green heart : love : green heart : photos impressive students learning gel electrophoresis DNA profiling .. first year ! : exploding head : Thank sharing photos @ GoreyEtss . 're looking forward seeing scientists next ! #BiotechExperience @ ABEProgOffice _IMAGE```
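
Two of the cleaning steps can be tried in isolation with only the `re` module. This is a simplified sketch, not the notebook's `Cleaner` class: it uses a shorter link pattern than the one in `replace_image`, and it leaves out emoji replacement and stopword removal because those need `demoji` and `nltk`.

```python
import re


def collapse_repeats(text):
    """Shorten any run of three or more identical characters to two."""
    return re.sub(r"(.)\1{2,}", r"\1\1", text)


def replace_tco_links(text, token="_IMAGE"):
    """Replace https://t.co/... links with a placeholder token."""
    return re.sub(r"https://t\.co/\w+", token, text)


sample = "Sooooo cool!!!! https://t.co/idow3wAkSd"
print(replace_tco_links(collapse_repeats(sample)))  # prints Soo cool!! _IMAGE
print(collapse_repeats("heeeeeeeello"))  # prints heello
```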

### Documentation
#### Usage of `class Cleaner`
```python
CleanerObj = Cleaner()
CleanerObj.clean_text(<text>)
```
###### `clean_text()` arguments
- `text` - string. the text that will be put through the cleaning process mentioned above.

#### Usage of `class Tweet_Cleaner`
```python
test = Tweet_Cleaner(workbook=<workbook_name>, match_phrase=<match_phrase>, result_sheet=<merge_result_sheet>, clean_sheet=<clean_tweet_sheet>)
test.merge_tweets()
test.clean_tweets()
```

- `workbook_name` - string. name of the workbook containing tweets.
- `match_phrase` - string. suffix used to identify the sheets that contain raw tweets. Ex: `"_raw"`
- `merge_result_sheet` - string. name of the sheet that the tweets from every raw sheet are merged into. Ex: `"merged_tweets"`
- `clean_tweet_sheet` - string. name of the sheet that the cleaned tweets are written to. Ex: `"cleaned_tweets"`
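
The suffix match that decides which sheets count as raw tweet sheets can be sketched on a list of titles. The helper name `select_raw_sheets` and the sample titles are made up for illustration; the notebook itself compares a slice of each worksheet title against `match_phrase`.

```python
def select_raw_sheets(titles, match_phrase):
    """Return the titles that end with `match_phrase`."""
    return [title for title in titles if title.endswith(match_phrase)]


titles = ["keywords", "11-03_raw", "11-04_raw", "merged_tweets"]
print(select_raw_sheets(titles, "_raw"))  # prints ['11-03_raw', '11-04_raw']
```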





In [8]:
class Cleaner:
  def replace_emoji_and_dup(self, text):
    '''
    Takes a string `text` and shortens every run of three or more identical
    characters to two, then replaces emojis with their descriptions.
    '''
    final = []

    # keep a character unless the previous two characters are the same character

    for i in range(len(text)):
      if i < 2:
        final.append(text[i])
        continue
      if (text[i-1] == text[i]) and (text[i-2] == text[i]):
        pass
      else:
        final.append(text[i])

    final = "".join(final)
    return demoji.replace_with_desc(final)

  # Example: Cleaner().replace_emoji_and_dup("heeeeeeeello") returns "heello"

  def replace_image(self, text):
    '''
    Takes a string `text` and replaces all image URLs (`https://t.co/...` links) with `_IMAGE`.
    '''
    return re.sub(r"https://t.co/(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\), ]|(?:%[0-9a-fA-F][0-9a-fA-F]))\w+", "_IMAGE", text)
    

  def replace_url(self, text):
    '''
    Takes a string `text` and replaces all URLs with `_URL`.
    '''
    return re.sub(r"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)", ' _URL', text)


  def remove_stopwords(self, text):
    stopwords = nltk.corpus.stopwords.words("english")
    tweet_words = nltk.word_tokenize(text)
    words = [w for w in tweet_words if w.lower() not in stopwords]
    clean_tweet = " ".join(words)
    return clean_tweet

  def clean_text(self, text):
    '''
    Master function: runs every cleaning step on `text` and returns the result.
    '''
    # collapse repeated characters and replace emojis with their descriptions
    cleaned = self.replace_emoji_and_dup(text)

    # replace t.co links with _IMAGE
    cleaned = self.replace_image(cleaned)

    # replace remaining links with _URL (done before tokenizing, which splits "https://" apart)
    cleaned = self.replace_url(cleaned)

    # remove stopwords
    cleaned = self.remove_stopwords(cleaned)

    return cleaned

In [9]:
class Tweet_Cleaner():
    def __init__(self, workbook, match_phrase, result_sheet, clean_sheet):
      self.workbook = gc.open(workbook)
      self.match_phrase = match_phrase
      self.result_sheet = result_sheet
      self.raw_tweet_sheets = self._get_raw_tweet_sheets() # list
      self.word_to_tweets = {}
      self.clean_sheet = clean_sheet

    def _get_raw_tweet_sheets(self):
      # adds all worksheets with tweets that need to be cleaned into a list
      valid_sheets = []
      worksheet_list = self.workbook.worksheets()
      for sheet in worksheet_list:
        # if sheet.title[-4:] == self.match_phrase:
        if sheet.title[-len(self.match_phrase):] == self.match_phrase:
          valid_sheets.append(sheet)
      return valid_sheets

    def merge_tweets(self):
      for sheet in self.raw_tweet_sheets:
        list_of_dicts = sheet.get_all_records()
        self._add_to_data(list_of_dicts)
      self._populate_tweets(self.result_sheet, self.word_to_tweets)

    def _add_to_data(self, list_of_dicts):
      # self.word_to_tweets maps each keyword to every unique tweet from the raw sheets
      for dictionary in list_of_dicts:
        for key, tweet in dictionary.items():
          # key is a keyword column such as "dna fingerprinting"
          if tweet == "":
            continue  # skip blank cells
          tweets = self.word_to_tweets.setdefault(key, [])
          if tweet not in tweets:
            tweets.append(tweet)  # ignore duplicates

    def _populate_tweets(self, sheet_name, data):
      try:
        worksheet_to_delete = self.workbook.worksheet(sheet_name)
        self.workbook.del_worksheet(worksheet_to_delete)
      except gspread.exceptions.WorksheetNotFound:
        # worksheet does not exist yet, so there is nothing to delete
        pass
      worksheet = self.workbook.add_worksheet(title=sheet_name, rows="1000", cols=str(len(data.keys())))
      dataframe = pd.DataFrame.from_dict(data, orient='index')
      dataframe = dataframe.transpose()
      worksheet.update([dataframe.columns.values.tolist()] + dataframe.values.tolist())

    def clean_tweets(self):
      CleanerObj = Cleaner()
      for key in self.word_to_tweets:
        for i in range(len(self.word_to_tweets[key])):
          cleaned = CleanerObj.clean_text(self.word_to_tweets[key][i])
          self.word_to_tweets[key][i] = cleaned
      self._populate_tweets(self.clean_sheet, self.word_to_tweets)
      
          


In [10]:
test = Tweet_Cleaner(workbook="[DATA] Public Sentiment on DNA Fingerprinting", match_phrase="_raw", result_sheet="merged_tweets", clean_sheet="cleaned_tweets")
test.merge_tweets()
test.clean_tweets()

## Sentiment Analysis

After populating all cleaned tweets into a spreadsheet, it is possible to perform sentiment analysis on the data. 

### Overview of Sentiment Analysis
1. Go through each tweet under each keyword and compute its sentiment score (the `compound` score from NLTK's VADER `SentimentIntensityAnalyzer`) and its subjectivity score (from the [TextBlob](https://textblob.readthedocs.io/en/dev/) Python library).
2. Store the sentiment and subjectivity scores in separate lists for each keyword
3. Find the average of each list in order to determine the average sentiment for each keyword
4. Output the average sentiment for each keyword to the console.
5. Grab the [noun phrases](https://textblob.readthedocs.io/en/dev/quickstart.html#noun-phrase-extraction) and populate them into another sheet.
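
The averaging in step 3 can be sketched on its own. The function below is a stand-in for `_get_final_results` with one deliberate difference, labeled here as an assumption: it skips keywords whose score list is empty rather than dividing by zero. The scores in the example are made up.

```python
def average_scores(scores_by_keyword):
    """Return {keyword: mean score rounded to 2 places}, skipping empty lists."""
    return {
        keyword: round(sum(scores) / len(scores), 2)
        for keyword, scores in scores_by_keyword.items()
        if scores
    }


example = {"dna typing": [0.5, -0.25, 0.0], "dna profile": [0.25, 0.75], "unused": []}
print(average_scores(example))  # prints {'dna typing': 0.08, 'dna profile': 0.5}
```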

### Sentiment, Subjectivity, and Noun Phrases
- **Sentiment**: A float from -1.0 to 1.0, where -1.0 represents the most negative language and 1.0 represents the most positive language.
- **Subjectivity**: A float from 0.0 to 1.0, where a higher value represents more subjective text.
- **Noun Phrases**: List of the noun phrases that TextBlob extracts from a tweet.
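
A sentiment score is often easier to read once it is turned into a label. The sketch below applies the cutoffs commonly suggested for VADER compound scores (0.05 and -0.05); this labeling is an assumption added for illustration and is not something the notebook itself does, and `label_compound` is a hypothetical helper.

```python
def label_compound(score, threshold=0.05):
    """Map a compound score in [-1.0, 1.0] to 'positive', 'negative', or 'neutral'."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


for score in (0.42, 0.0, -0.3):
    print(score, label_compound(score))
# prints:
# 0.42 positive
# 0.0 neutral
# -0.3 negative
```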

### Documentation
#### Usage of `class Tweet_Analyzer`

```py
Tweet_Machine = Tweet_Analyzer("<workbook_name>", "<sentiment_sheet_name>")
Tweet_Machine.analyze_sentiment()
Tweet_Machine.get_most_used_keywords(result_sheet="<phrase_result_sheet_name>")
```

- `workbook_name` - string. name of the workbook containing tweets.
- `sentiment_sheet_name` - string. name of the worksheet containing the tweets that you want to perform sentiment analysis on.
- `phrase_result_sheet_name` - string. name of the worksheet to populate all noun phrases onto
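
Once the noun phrases are collected, one possible follow-up is to count them so the most frequent phrases stand out. This is a sketch of an extra step that the notebook does not perform: `most_common_phrases` is a hypothetical helper, and the phrases in the example are made up. Like `_clean_phrases`, it drops phrases that start with `@`.

```python
from collections import Counter


def most_common_phrases(phrases, top_n=3):
    """Count phrases, skipping blanks and @-mentions, and return the top_n most frequent."""
    counts = Counter(p for p in phrases if p and not p.startswith("@"))
    return counts.most_common(top_n)


phrases = ["dna profiling", "gel electrophoresis", "@abeprogoffice", "dna profiling"]
print(most_common_phrases(phrases))
# prints [('dna profiling', 2), ('gel electrophoresis', 1)]
```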

In [44]:
class Tweet_Analyzer():
  def __init__(self, workbook, sheet_name):
    self.workbook = gc.open(workbook)
    self.worksheet = self.workbook.worksheet(sheet_name)
    self.word_to_tweets = {}
    self.sentiment_results = {} # keyword to each sentiment score
    self.subjectivity_results = {} # higher subjectivity = more likely to be an opinion; lower subjectivity = more likely to be factual info
    self.noun_lists = {}

  def analyze_sentiment(self):
    sia = SentimentIntensityAnalyzer()
    self._get_all_tweets() # populates self.word_to_tweets
    # go through each key and each list within that key to conduct sentiment analysis
    for keyword in self.word_to_tweets:
      # self.word_to_tweets[key] is a list of all tweets
      for tweet in self.word_to_tweets[keyword]:
        tweet_analyze = TextBlob(tweet)
        score = sia.polarity_scores(str(tweet))
        self.sentiment_results[keyword].append(score['compound'])
        self.subjectivity_results[keyword].append(tweet_analyze.sentiment.subjectivity)

    print("results for sentiment results:")
    
    overall_sentiment = self._get_final_results(self.sentiment_results)
    df_1 = pd.DataFrame.from_dict(overall_sentiment, orient="index")
    print(df_1)
    
    print()

    print("results for subjectivity results:")
    overall_subjectivity = self._get_final_results(self.subjectivity_results)
    df_2 = pd.DataFrame.from_dict(overall_subjectivity, orient="index")
    print(df_2)
    
  def _get_final_results(self, dictionary):
    results = {}
    for keyword in dictionary:
      total = sum(dictionary[keyword])
      count = len(dictionary[keyword])
      results[keyword] = round(total/count, 2)
    return results

  def _get_all_tweets(self):
    list_of_dicts = self.worksheet.get_all_records()
    self._add_to_data(list_of_dicts)

  def _add_to_data(self, list_of_dicts):
    # self.word_to_tweets maps each keyword to every unique tweet from the cleaned sheet
    for dictionary in list_of_dicts:
      for key, tweet in dictionary.items():
        # key is a keyword column such as "dna fingerprinting"
        if tweet == "":
          continue  # skip blank cells
        if key not in self.word_to_tweets:
          # first tweet for this keyword: set up its result lists
          self.word_to_tweets[key] = []
          self.sentiment_results[key] = []
          self.subjectivity_results[key] = []
        if tweet not in self.word_to_tweets[key]:
          self.word_to_tweets[key].append(tweet)  # ignore duplicates

  def get_most_used_keywords(self, result_sheet):
    self._get_all_tweets()
    for keyword in self.word_to_tweets:
      # self.word_to_tweets[key] is a list of all tweets
      for tweet in self.word_to_tweets[keyword]:
        tweet_analyze = TextBlob(tweet)
        tweet_noun_phrases = list(tweet_analyze.noun_phrases)
        tweet_noun_phrases = self._clean_phrases(tweet_noun_phrases)
        self._add_to_noun_lists(keyword, tweet_noun_phrases)
    # print(self.noun_lists) # noun lists maps keyword to a list of all noun phrases from all tweets in that category
    self._add_to_sheet(result_sheet)

  def _add_to_sheet(self, result_sheet):
    try:
      worksheet_to_delete = self.workbook.worksheet(result_sheet)
      self.workbook.del_worksheet(worksheet_to_delete)
    except gspread.exceptions.WorksheetNotFound:
      # worksheet does not exist yet, so there is nothing to delete
      pass
    
    keys = self.noun_lists.keys()
    worksheet = self.workbook.add_worksheet(title=result_sheet, rows="1000", cols=str(len(keys)))
    dataframe = pd.DataFrame.from_dict(self.noun_lists, orient='index')
    dataframe = dataframe.transpose()
    worksheet.update([dataframe.columns.values.tolist()] + dataframe.values.tolist())


  def _add_to_noun_lists(self, keyword, lst):
    if keyword in self.noun_lists:
      self.noun_lists[keyword].extend(lst)
    else:
      self.noun_lists[keyword] = lst

  def _clean_phrases(self, lst):
    new_phrases = []
    for item in lst:
      if item[0] != "@":
        new_phrases.append(item)
    return new_phrases
  

In [46]:
Tweet_Machine = Tweet_Analyzer("[DATA] Public Sentiment on DNA Fingerprinting", "cleaned_tweets")
Tweet_Machine.analyze_sentiment()
Tweet_Machine.get_most_used_keywords(result_sheet="phrases")

results for sentiment results:
                           0
dna fingerprint         0.06
genetic fingerprint     0.13
dna identification      0.04
dna profiling           0.02
dna fingerprinting      0.10
dna profile             0.09
genetic fingerprinting -0.07
dna typing              0.05
genetic profile         0.10
genetic profiling       0.08

results for subjectivity results:
                           0
dna fingerprint         0.39
genetic fingerprint     0.40
dna identification      0.40
dna profiling           0.38
dna fingerprinting      0.35
dna profile             0.41
genetic fingerprinting  0.37
dna typing              0.43
genetic profile         0.40
genetic profiling       0.45
