# Theme Frequency Processing 

### Step 1: Clean Interview Text → List of Responses

In [25]:
import re
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

from sklearn.feature_extraction.text import CountVectorizer
from collections import Counter

plt.style.use('ggplot')
sns.set(font_scale=1.2)


In [None]:

def clean_interview_text(raw_text):
    # Split based on numbered questions (1.) 2.) etc.)
    raw_sentences = re.split(r"\n?\d+\.\)", raw_text)
    
    # Remove extra whitespace and empty strings
    cleaned = [s.strip() for s in raw_sentences if s.strip()]
    
    return cleaned

# for Original Fantuan App 
# Interview Responses
original_raw_text = """
1.) easy, not hard to do. Annoying thing was the information overload. VIP, ads. That was overwhelming. Other than that, the actual action was fine.  1.5 

2.)  Smilar, given I use UBER eats Familiar to me. food types are small, restuarutants icons are too small. I don't like scrolling to much 

3.) It's too vertical. I barely got to see any food. I want to see the customization options BEFORE I add it to my cart. So I can get it. 3.5

4.) Nothing really. Again, show me customization methods  before I add. 

5.) relatively easy to select a place. The address already being there helped. Way too many pop-up and screens. having to work with a new screen was annoying. Too many screens. 4.5 

6.) The sizing of the elemnts are strange. Why are the food type icons too small, while the checkout is good. The sizing isn't very homogenous. Too many pages. "
"1.) 2. the icon for Korean food was a little ahrd to find. 
2.) Layout was fine. List format was ok, the fact that they're all scrollable in one long scroll. 

3.) 6. Ordering the dish was fine, customizing was confusing, I couldn't ell which ones were customizable. It just tells me to add it to cart. NOt crippling, but slightly annoying. 
4.) consufion about how I can customize stuff was annoying. Having a custmization button would be good, beyond that just being told to add stuff to cart. 

5.) 1. Since it was already filled out. 
6.) NOthing comes to mind. "
"1.)  7. Made me feel stupid. I selected the korean section, it gave me chinese restaurants. 
2.) It's ok. I don't like how much I have to scroll. 

3.) 2. Customizng is really nice. It shows me all my options, and all I got to do is click on it. It's all very readable. THIS is all assuming the app actually works. 
4.) Advantage over the redesign, it can show off the food alot better. I find the full-width banner images for showing off the food is a better fit. I hate this app. 

5.) 2. It was very easy to select an address, them using geographical data was really helpful as well. ASAP delivery time is default, so I didn't even need to select it. 

6.) If it works, it's good. But when it doesn't, it's EXTREMLY frustring to use. Working meaning long load times, or the buttons being unresponsive. "
"1.) 6, it wasn't straightforward and the interface was very complicated. 
2.) Very crowded and colorful so it was kinda eye-straining.

3.) 7, very hard and complicated. Especially the customizing part was very challenging.
4.) No, not really.

5.) 2, it was easier than the rest of the tasks.
6.) I hated the pop-up ads. They were very annoying. Also the overall app was very confusing and crowded. "
"1.) 5, since it wasn't easy to find a korean restaurant.
2.) Although I chose a korean restaurant, the reastaurant that came up were japanese restaurants, whgich was very frustrating.

3.) 8, I did not like it at all, it was very overwhelming.
4.) When I selected the dish, the app asked me to put my phone number and my email address which was very unnnecessary at that stage.

5.) 4, it wasn't that complecated but still it was very crowded.
6.) I wish the app wasn't this colorful and messy. 
"""

# for Team's Redsigned prototype
# Interview Responses
redesigned_raw_text = """
1.) 1 (being easy). Options were streamlined. Sizing was more homogneous. 
2.) Don't like my address at the top. Don't like the deals section being on the main menu, I prefer it being on the restauratn menu, so it feels like I have a choice of what I'm ordering. 

3.) 2. I like that I can see the options before I added it to my cart. But how would this work on a phone? Woldn't this squish the options? The idea is nice, but I'm worried about the introduction. 
4.) I would prefer only seeing the dish I am customizing. Why am I seeing the other ones? 

5.) 7 as in difficult. In the app, it assumed earliet delivery time. What's the pin? I was unsure of what delivery time was. I would prfer seeing the options one at a time, all at once is overwhleming. This feels unnatural. THe optional nature of coupons and merchant message should be designed around. not a fan of the current 9 box layout. 

6.) The box system feels like it's priming me to pick one box. This sort of translates over to the delivery menu. The devliery options have a chronologically, but how do I figure out that chronology? the boxes lack chronology, I don't know the order in which I need to complete them. LAcking that chronology is an issue, what if one affects the choices of the other. I still have no idea which ones are optinal or not.  I prefer uber eats delivery menu (I've been using it for a long time). "
"1.) 2. IT felt the same as last time
2.) no, not really 
3.) 3. it was easier to see the customization options. The font was bad. 
4.) Font was annoying. 

5.) 5. Some of the information felt like it shold have been autofilled. What the heck was a pin? 
6.) Overall, the 2 dsigns were really comparable. "
"1.) 3. The restaurant doesn't immeditaely appear to me as korean. I had to search for the work korean 
2.) Pretty good. Everything is salient. It's hard to identify what nationarlity the restaurant belongs to . I ogtta read the names. 

3.) 2. the UI is tiny. There's all this space. Customization menu itself was very compact. Making it larger would be better. I'm unsure if this interface would fit on a phone. The position of the customiation menu wouldn't fit on the food. 
4.) Menu is nice. Menu was fast. Spacing is good for food. Naming of dishes is good. 

5.) 2. Why does the coupon button break the pattern established by the previous buttons? Same for the utensils button. The plus button is nice, but it would make snese to tap directly on the box itself. Place button is cute. 

6.) Progress bar is cool. The minimal design is cool. I'd be totally ok with this interface. Put icons in the delivery section to explain what each fieldd represents. Like a credit-card icon for the payment method section.  the place of the dish customization menu is not good. "
"1.) 2, it was pretty easy and straighforwared.
2.) Everything is pretty basic which makes evrything much more easier so I liked the simplicity.

3.) 2, again it was pretty easy.
4.) The font of customize section was pretty small so it was kinda hard to read.

5.) 1, since the info was already written.
6.) Overall I think it looks great, it's very easy to use and I really liked the idea of progress bar instead of scrolling forever :)"
"1.) 1, very easy.
2.) It's intuitive, clear

3.) 3, the font wasn't easy to read
4.) I think reading the extra toppings part was a little hard but not terrible. 

5.) 2, becuase I didn't need to write anything
6.) I think you dfid a great job, I would use this app instead of the original one for sure.
"""

original_responses = re.split(r'\d\.\)', original_raw_text)
redesigned_responses = re.split(r'\d\.\)', redesigned_raw_text)

# Clean and remove empty strings and extra spaces
cleaned_original_responses = [resp.strip().replace('\n', ' ') for resp in original_responses if resp.strip()]
cleaned_redesigned_responses = [resp.strip().replace('\n', ' ') for resp in redesigned_responses if resp.strip()]



### Step 2: Preprocessing


In [None]:
# Download stopwords if not already downloaded
nltk.download('stopwords')
nltk.download('punkt')

# Load stopwords
stop_words = stopwords.words('english')

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/Fanyiling/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /Users/Fanyiling/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


### Step 3: Step 1: Vectorize Responses (Bag of Words)

In [9]:
def vectorize_responses(responses):
    vectorizer = CountVectorizer(stop_words=stop_words, max_features=50)
    X = vectorizer.fit_transform(responses)
    word_freq = X.toarray().sum(axis=0)
    words = vectorizer.get_feature_names_out()
    freq_df = pd.DataFrame({'Word': words, 'Frequency': word_freq})
    freq_df = freq_df.sort_values(by='Frequency', ascending=False).reset_index(drop=True)
    return freq_df


### Steo 4: Run Frequency Analysis

In [10]:
original_freq = vectorize_responses(cleaned_original_responses)
redesigned_freq = vectorize_responses(cleaned_redesigned_responses)


### ???

In [14]:
def get_word_frequencies(text_list):
    tokens = []
    for sentence in text_list:
        words = word_tokenize(sentence.lower())
        words = [w for w in words if w.isalpha() and w not in stop_words]
        tokens.extend(words)
    return Counter(tokens)


In [21]:
original_freq = get_word_frequencies(original_responses)
redesigned_freq = get_word_frequencies(redesigned_responses)

print(original_freq.most_common(10))
print(redesigned_freq.most_common(10))

LookupError: 
**********************************************************************
  Resource [93mpunkt_tab[0m not found.
  Please use the NLTK Downloader to obtain the resource:

  [31m>>> import nltk
  >>> nltk.download('punkt_tab')
  [0m
  For more information see: https://www.nltk.org/data.html

  Attempted to load [93mtokenizers/punkt_tab/english/[0m

  Searched in:
    - '/Users/Fanyiling/nltk_data'
    - '/Library/Frameworks/Python.framework/Versions/3.13/nltk_data'
    - '/Library/Frameworks/Python.framework/Versions/3.13/share/nltk_data'
    - '/Library/Frameworks/Python.framework/Versions/3.13/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - '/Users/Fanyiling/nltk_data'
**********************************************************************
