<h1>App-Klassifikation für Life Sciences</h1>

**Ziel:**
Baue einen einfachen Python-Klassifikator in einem Low-Resource-Setting, um Apps zu identifizieren, die für die Branchen Life Science, MedTech und Pharma relevant sein könnten oder welche Apps GxP relevant sind.

**Anforderungen:**

- Verwende ein Jupyter Notebook.
- Nutze die im Anhang bereitgestellten Beispieldaten als Grundlage.
- Entwickle mindestens zwei Klassifikationsmethoden (z. B. einfache ML-Modelle, Vektoransätze).
- Annotiere und prüfe deine Daten.
- Visualisiere deine Ergebnisse (z. B. Konfusionsmatrix, Embedding-Plots, Feature-Importance).
- Gib die wichtigsten Merkmale für deine Klassifikation an.
- Erkläre deine Vorgehensweise und Methodenwahl.

**Optionale Tools:**

- pandas, numpy, scikit-learn, weaviate
- fasttext, word2vec

**Zusätzliche Empfehlungen:**

- Nutze ggf. Transfer Learning oder Embedding-Techniken, um mit wenigen Beispielen zu arbeiten.
- Begründe deine Modellwahl (z. B. warum ein bestimmter Klassifikator für dieses Setting geeignet ist).
- Achte auf Nachvollziehbarkeit und Reproduzierbarkeit deines Codes.

In [7]:
# imported necessary libraries
from google_play_scraper import search, app # Life Sciences App Scraper using google-play-scraper
from app_store_scraper import AppStore # Life Sciences App Scraper using apple-scraper
import pandas as pd
import time # to avoid anti-scraper
import requests
# 
# tf-idf to collect keywords
from sklearn.feature_extraction.text import TfidfVectorizer
from nltk.corpus import stopwords
import nltk
nltk.download('stopwords')
#prune the keyword list
import fasttext

# gensim Word to vector
from nltk.tokenize import word_tokenize
from gensim.models import Word2Vec

# 
nltk.download('punkt')

#filter out ill formed data
import re

# for data annotation
import ipywidgets as widgets
from IPython.display import display, clear_output

[nltk_data] Downloading package stopwords to /home/keen/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /home/keen/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [8]:
#apple version: collect data from apple store, using keyword list:
def fetch_apple_store_apps(keywords, lang='en', country='US', file_suffix='round0'):
    """
    from Apple App Store, fetch data 

    Parameters:
    1, keywords: 
    2, lang: default as 'en'
    3, country: default as 'US'
    4, file_suffix: csv file name: default as'round0'

    return:
    1, pandas DataFrame
    """
    app_data = []

    for kw in keywords:
        print(f"Searching Apple Store for: {kw}")
        try:
            url = f"https://itunes.apple.com/search?term={kw}&entity=software&country={country}&lang={lang}&limit=50"
            response = requests.get(url)
            data = response.json()

            if data['resultCount'] == 0:
                print(f"  No results for keyword: {kw}")
                continue

            for app in data['results']:
                app_data.append({
                    "trackId": app.get("trackId"),
                    "title": app.get("trackName"),
                    "description": app.get("description"),
                    "genre": app.get("primaryGenreName"),
                    "developer": app.get("sellerName"),
                    "url": app.get("trackViewUrl"),
                    "rating": app.get("averageUserRating"),
                    "ratingCount": app.get("userRatingCount"),
                    "keyword": kw
                })
            print(f"  ✓ Found {len(data['results'])} apps for keyword '{kw}'")
        except Exception as e:
            print(f"  × Error searching keyword {kw}: {e}")
        time.sleep(1)

    # create DataFrame,remove duplication
    if app_data:
        df = pd.DataFrame(app_data)
        df = df.drop_duplicates(subset=['trackId'])

        # construct csv file name
        filename = f"{file_suffix}_apple_{lang}.csv"
        df.to_csv(filename, index=False)
        print(f"✅ Saved data to {filename}, total apps after deduplication: {len(df)}")
        return df
    else:
        print("⚠️ No data collected.")
        return pd.DataFrame()


In [15]:
#google version: collect data from google store, using keyword list:
def fetch_google_play_apps(keywords, lang='en', country='us', file_suffix='round0'):
    """
    from google play, fetch data 

    Parameters:
    1, keywords: 
    2, lang: default as 'en'
    3, country: default as 'US'
    4, file_suffix: csv file name: default as'round0'

    return:
    1, pandas DataFrame
    """
    app_data = []

    for kw in keywords:
        print(f"Searching Google Play for: {kw}")
        try:
            results = search(kw, lang=lang, country=country, n_hits=50)
        except Exception as e:
            print(f"  × Search failed for keyword '{kw}': {e}")
            continue

        for res in results:
            try:
                app_info = app(res['appId'], lang=lang, country=country)
                app_data.append({
                    "appId": app_info.get("appId"),
                    "title": app_info.get("title"),
                    "description": app_info.get("description"),
                    "genre": app_info.get("genre"),
                    "score": app_info.get("score"),
                    "installs": app_info.get("installs"),
                    "developer": app_info.get("developer"),
                    "url": app_info.get("url"),
                    "keyword": kw
                })
            except Exception as e:
                print(f"  × Error fetching details for appId '{res.get('appId', 'N/A')}': {e}")
            time.sleep(1)  # avoid to be blocked

    # get DataFrame, remove duplication
    if app_data:
        df = pd.DataFrame(app_data)
        df = df.drop_duplicates(subset=['appId'])

        # construct csv file name
        filename = f"{file_suffix}_googleplay_{lang}.csv"
        df.to_csv(filename, index=False)
        print(f"✅ Saved data to {filename}, total apps after deduplication: {len(df)}")
        return df
    else:
        print("⚠️ No data collected.")
        return pd.DataFrame()


In [4]:
initial_keywords = ["Life Science", "MedTech", "pharma", "GxP"]
df_round_0_google = fetch_google_play_apps(initial_keywords, lang='en', country='us', file_suffix='round0')
df_round_0_apple = fetch_apple_store_apps(initial_keywords, lang='en', country='us', file_suffix='round0')

Searching Google Play for: Life Science
Searching Google Play for: MedTech
Searching Google Play for: pharma
Searching Google Play for: GxP
✅ Saved data to round0_googleplay_en.csv, total apps after duplication: 118
Searching Apple Store for: Life Science
  ✓ Found 50 apps for keyword 'Life Science'
Searching Apple Store for: MedTech
  ✓ Found 41 apps for keyword 'MedTech'
Searching Apple Store for: pharma
  ✓ Found 46 apps for keyword 'pharma'
Searching Apple Store for: GxP
  ✓ Found 7 apps for keyword 'GxP'
✅ Saved data to round0_apple_en.csv, total apps after deduplication: 141


In [5]:
def extract_new_kw_tfidf(df, text_column='description', existing_keywords=None, lang='english', max_features=100):
    """
    extract TF-IDF keywords from DataFrame's description, return a keyword list remove the existing keywords

    parameters:
    1, df: pandas DataFrame
    2, text_column: for extraction, default as 'description'
    3, existing_keywords: optional, if already exist some keywords
    4, lang: for stopwords, default as 'english'
    5, max_features: at most perserved feature number, default as 100

    return:
    1, new_keywords: list
    """
    # 1. check 
    if existing_keywords is None:
        existing_keywords = []

    # 2. create TfidfVectorizer
    vectorizer = TfidfVectorizer(
        max_features=max_features,
        stop_words=stopwords.words(lang)
    )

    # 3. dump nan data
    X = vectorizer.fit_transform(df[text_column].dropna())

    # 4. get feature
    tfidf_keywords = vectorizer.get_feature_names_out()

    # 5. remove duplication
    new_keywords = list(set(tfidf_keywords) - set(existing_keywords))

    # 6. print first 100 keyworda
    print(f"✅ Extracted {len(new_keywords)} new keywords (top 100 shown):")
    print(new_keywords[:100])

    return new_keywords


In [6]:
keywords_google_gensim = extract_new_kw_tfidf(df_round_0_google,'description',initial_keywords,'english')
keywords_apple_gensim = extract_new_kw_tfidf(df_round_0_apple,'description',initial_keywords,'english')
print(keywords_google_gensim)
print(keywords_apple_gensim)

✅ Extracted 99 new keywords (top 100 shown):
['clinical', 'order', 'support', 'care', 'using', 'search', 'treatment', 'including', 'www', 'https', 'take', 'save', 'gps', 'papers', 'exams', 'medical', 'find', 'time', 'employee', 'drug', 'help', 'one', 'medicine', 'use', 'hotels', 'rewards', 'medicines', 'comprehensive', 'professionals', 'best', 'terms', 'app', 'every', 'exam', 'available', 'information', 'anywhere', 'com', 'offers', 'offline', 'drugs', 'real', 'whether', 'biology', 'like', 'plan', 'test', 'user', 'download', 'topics', 'view', 'need', 'maps', 'store', 'pharmacy', 'choose', 'get', 'us', 'life', 'tools', 'access', 'questions', 'healthcare', 'easily', 'prescription', 'features', 'mobile', 'experience', 'li', 'products', 'learning', 'based', 'learn', 'subscription', 'news', 'content', 'stay', 'practice', 'easy', 'new', 'key', 'make', 'online', 'card', 'manage', 'management', 'also', 'science', 'provides', 'track', 'data', 'free', 'study', 'application', 'knowledge', 'health'

In [7]:
clean_keywords_google_gensim = ['medication','medicine','pharmacy','clinical','healthcare','prescription','medical','biology','science']
clean_keywords_apple_gensim = ['pharmacy','medtech','healthcare','medical','health','pharmaceutical']
clean_keywords_combine_gensim =  ['medication','medicine','pharmacy','clinical','healthcare','prescription','medical','biology','science','medtech','pharmaceutical']

In [8]:
def extract_expand_kw_word2vec(df, text_column='description',
                                       keywords_to_check=None,
                                       lang='english',
                                       vector_size=50, window=5,
                                       min_count=1, workers=4,
                                       topn=10):
    """
    train Word2Vec model, get similar words for each keywords,return model and all similar word list

    parameters:
    1, df: DataFrame
    2, text_column: default as description
    3, keywords_to_check: keyword list
    4, lang: stopword language 'english'
    5, vector_size, window, min_count, workers: Word2Vec parameter of the model 
    6, topn: number of similar words, default as 10

    return:
    1, model: trained Word2Vec model
    2, expanded_keywords: list[str]
    """

    print("📘 preparing stopwords and data...")
    stop_words = set(stopwords.words(lang))
    corpus = []
    for doc in df[text_column].dropna():
        tokens = word_tokenize(str(doc).lower())
        filtered_tokens = [w for w in tokens if w.isalpha() and w not in stop_words]
        if filtered_tokens:
            corpus.append(filtered_tokens)

    print(f"✅ preprocess finished, {len(corpus)} corpus in total.")

    print("🚀 trainning Word2Vec model...")
    model = Word2Vec(sentences=corpus,
                     vector_size=vector_size,
                     window=window,
                     min_count=min_count,
                     workers=workers)
    print("✅ training completed")

    all_words = [w for doc in corpus for w in doc]
    expanded_keywords = set()

    if keywords_to_check:
        for kw in keywords_to_check:
            print(f"\n🔍 keywords:'{kw}'")
            count = all_words.count(kw)
            print(f"  appearing time: {count}")
            try:
                similar_words = model.wv.most_similar(kw, topn=topn)
                print("similar words:")
                for word, score in similar_words:
                    print(f"    {word} ({score:.2f})")
                    expanded_keywords.add(word)
                expanded_keywords.add(kw)  # added original keywords
            except KeyError:
                print(f"⚠️ '{kw}' not included in the glossary")

    # transfor as list
    expanded_keywords = sorted(list(expanded_keywords))
    print(f"\n📦 return keywords number: {len(expanded_keywords)} ")

    return model, expanded_keywords

# prune by hand

In [9]:
model_google,keywords_google_w2v = extract_expand_kw_word2vec(df_round_0_google, text_column='description',
                                       keywords_to_check=["medical"],
                                       lang='english',
                                       vector_size=50, window=5,
                                       min_count=1, workers=4,
                                       topn=30)
model_apple,keywords_apple_w2v = extract_expand_kw_word2vec(df_round_0_apple, text_column='description',
                                       keywords_to_check=["medical"],
                                       lang='english',
                                       vector_size=50, window=5,
                                       min_count=1, workers=4,
                                       topn=20)

📘 preparing stopwords and data...
✅ preprocess finished, 118 corpus in total.
🚀 trainning Word2Vec model...
✅ training completed

🔍 keywords:'medical'
  appearing time: 134
similar words:
    app (0.97)
    test (0.97)
    access (0.96)
    get (0.96)
    care (0.96)
    pharmacy (0.96)
    find (0.95)
    exam (0.95)
    medicine (0.95)
    use (0.95)
    learn (0.95)
    help (0.95)
    students (0.95)
    features (0.95)
    manage (0.95)
    information (0.95)
    life (0.95)
    time (0.95)
    also (0.95)
    free (0.95)
    maps (0.95)
    drug (0.95)
    online (0.95)
    healthcare (0.94)
    need (0.94)
    practice (0.94)
    health (0.94)
    like (0.94)
    questions (0.94)
    search (0.94)

📦 return keywords number: 31 
📘 preparing stopwords and data...
✅ preprocess finished, 140 corpus in total.
🚀 trainning Word2Vec model...
✅ training completed

🔍 keywords:'medical'
  appearing time: 66
similar words:
    test (0.55)
    medtech (0.52)
    e (0.50)
    new (0.50)
    d

In [10]:
print(keywords_google_w2v)
print(keywords_apple_w2v)

['access', 'also', 'app', 'care', 'drug', 'exam', 'features', 'find', 'free', 'get', 'health', 'healthcare', 'help', 'information', 'learn', 'life', 'like', 'manage', 'maps', 'medical', 'medicine', 'need', 'online', 'pharmacy', 'practice', 'questions', 'search', 'students', 'test', 'time', 'use']
['app', 'check', 'data', 'discover', 'e', 'easy', 'fee', 'intentional', 'learn', 'learning', 'manage', 'medical', 'medtech', 'new', 'play', 'posiłkowa', 'random', 'test', 'time', 'topics', 'visualmente']


In [11]:
df_round_1_google = fetch_google_play_apps(clean_keywords_combine_gensim, lang='en', country='us', file_suffix='round1')
df_round_1_apple = fetch_apple_store_apps(clean_keywords_combine_gensim, lang='en', country='us', file_suffix='round1')

Searching Google Play for: medication
Searching Google Play for: medicine
Searching Google Play for: pharmacy
Searching Google Play for: clinical
Searching Google Play for: healthcare
Searching Google Play for: prescription
Searching Google Play for: medical
Searching Google Play for: biology
Searching Google Play for: science
Searching Google Play for: medtech
Searching Google Play for: pharmaceutical
✅ Saved data to round1_googleplay_en.csv, total apps after duplication: 246
Searching Apple Store for: medication
  ✓ Found 51 apps for keyword 'medication'
Searching Apple Store for: medicine
  ✓ Found 55 apps for keyword 'medicine'
Searching Apple Store for: pharmacy
  ✓ Found 51 apps for keyword 'pharmacy'
Searching Apple Store for: clinical
  ✓ Found 50 apps for keyword 'clinical'
Searching Apple Store for: healthcare
  ✓ Found 53 apps for keyword 'healthcare'
Searching Apple Store for: prescription
  ✓ Found 53 apps for keyword 'prescription'
Searching Apple Store for: medical
  ✓ F

In [12]:
clean_keywords_combine =  ["Life Science", "MedTech", "pharma", "GxP",'medication','medicine','pharmacy','clinical','healthcare','prescription','medical','biology','science','medtech','pharmaceutical']
df_done_google = fetch_google_play_apps(clean_keywords_combine, lang='en', country='us', file_suffix='done')
df_done_apple = fetch_apple_store_apps(clean_keywords_combine, lang='en', country='us', file_suffix='done')

Searching Google Play for: Life Science
Searching Google Play for: MedTech
Searching Google Play for: pharma
Searching Google Play for: GxP
Searching Google Play for: medication
Searching Google Play for: medicine
Searching Google Play for: pharmacy
Searching Google Play for: clinical
Searching Google Play for: healthcare
Searching Google Play for: prescription
Searching Google Play for: medical
Searching Google Play for: biology
Searching Google Play for: science
Searching Google Play for: medtech
Searching Google Play for: pharmaceutical
✅ Saved data to done_googleplay_en.csv, total apps after duplication: 318
Searching Apple Store for: Life Science
  ✓ Found 50 apps for keyword 'Life Science'
Searching Apple Store for: MedTech
  ✓ Found 41 apps for keyword 'MedTech'
Searching Apple Store for: pharma
  ✓ Found 46 apps for keyword 'pharma'
Searching Apple Store for: GxP
  ✓ Found 7 apps for keyword 'GxP'
Searching Apple Store for: medication
  ✓ Found 51 apps for keyword 'medication'


In [13]:
'''
# show all col
pd.set_option('display.max_columns', None)

# row
pd.set_option('display.max_rows', None)

# width
pd.set_option('display.max_colwidth', None)
'''


# display setting
pd.reset_option("display.max_rows")

In [14]:
import re

def filter_google_play_links(csv_path, column_name='trackId', output_path=None):
    """
    read CSV, keep the data that fits Google Play's format

    parameters
    1, csv_path (str): CSV file path
    2, column_name (str)
    3, output_path (str, optional)

    return:
    4, pd.DataFrame: clean DataFrame
    """
    df = pd.read_csv(csv_path)

    # define Google Play link's regulation
    pattern = re.compile(r'^https:\/\/play\.google\.com\/store\/apps\/details\?id=[\w\.]+(?:&[\w=]*)*$')

    # filter using bool mask
    mask = df[column_name].apply(lambda x: bool(pattern.match(str(x).strip())))
    df_filtered = df[mask].copy()

    # store the clean data
    if output_path:
        df_filtered.to_csv(output_path, index=False)
        print(f"✅ Filtered file saved to: {output_path}")

    print(f"✅ {len(df_filtered)} valid Google Play links retained.")
    return df_filtered


In [15]:
anno_df_google = filter_google_play_links("done_googleplay_en.csv", column_name='url', output_path='clean_google_en.csv')

✅ Filtered file saved to: clean_google_en.csv
✅ 318 valid Google Play links retained.


In [16]:
def filter_apple_links(csv_path, column_name='url', output_path=None):
    """
    read CSV, keep the data that fits https://apps.apple.coms format

    parameters
    1, csv_path (str): CSV file path
    2, column_name (str)
    3, output_path (str, optional)

    return:
    4, pd.DataFrame: clean DataFrame
    """
    df = pd.read_csv(csv_path)

    # check if col exist
    if column_name not in df.columns:
        raise ValueError(f"❌ file name'{column_name}' not exist. CSV's columns' name includes{df.columns.tolist()}")

    # regulation
    pattern = re.compile(r'^https:\/\/apps\.apple\.com')

    # filter
    mask = df[column_name].apply(lambda x: bool(pattern.match(str(x).strip())))
    df_filtered = df[mask].copy()

    # store the result csv file
    if output_path:
        df_filtered.to_csv(output_path, index=False)
        print(f"✅ store Apple clean data to: {output_path}")

    print(f"✅ in total {len(df_filtered)} Apple corpus")
    return df_filtered


In [17]:
anno_df_apple = filter_apple_links("done_apple_en.csv", column_name='url', output_path='clean_apple_en.csv')

✅ store Apple clean data to: clean_apple_en.csv
✅ in total 546 Apple corpus


In [18]:
###annotation by hand:

In [5]:
def launch_annotation_tool(df=None, csv_path=None, text_column='description', label_column='label', output_path='annotated_output.csv'):
    # load data
    if df is None:
        if csv_path is None:
            raise ValueError("Please provide df or csv_path")
        df = pd.read_csv(csv_path)
    else:
        df = df.copy()

    if label_column not in df.columns:
        df[label_column] = ""

    df.reset_index(drop=True, inplace=True)
    pending_df = df[df[label_column] == ""].reset_index()

    if len(pending_df) == 0:
        print("✅ all data annotated! ")
        return

    current_idx = 0

    # def button
    btn_0 = widgets.Button(description="0 ❌ not relevant", button_style='danger')
    btn_1 = widgets.Button(description="1 ✅ relevant", button_style='success')
    output = widgets.Output()

    def show_annotation(idx):
        with output:
            clear_output(wait=True)
            if idx >= len(pending_df):
                print("✅ finished, storing...")
                df.to_csv(output_path, index=False)
                print(f"✅ stored to: {output_path}")
                return
            row_idx = pending_df.loc[idx, 'index']
            text = df.loc[row_idx, text_column]
            print(f"📄 No. {idx + 1}  / in total {len(pending_df)} \n")
            print(text)

    def on_button_clicked(label_value):
        nonlocal current_idx
        if current_idx >= len(pending_df):
            return
        row_idx = pending_df.loc[current_idx, 'index']
        df.at[row_idx, label_column] = label_value
        current_idx += 1
        show_annotation(current_idx)

    btn_0.on_click(lambda b: on_button_clicked(0))
    btn_1.on_click(lambda b: on_button_clicked(1))

    display(widgets.HBox([btn_0, btn_1]), output)
    show_annotation(current_idx)


In [6]:
launch_annotation_tool(csv_path='clean_google_en.csv', output_path='annotated__google.csv')

HBox(children=(Button(button_style='danger', description='0 ❌ not relevant', style=ButtonStyle()), Button(butt…

Output()

In [3]:
def interactive_label_review(
    csv_path,
    text_column='description',
    label_column='label',
    output_path='corrected_output.csv'
):
    """
    loop through csv file, check if there is sth need to be modify, and save.
    """
    with open('annotated__google.csv', encoding='utf-8', errors='ignore') as f:
        df = pd.read_csv(f)
    current = {'i': 0}

    btn_keep = widgets.Button(description="Keep the same")
    btn_set_0 = widgets.Button(description="modify to 0", button_style='danger')
    btn_set_1 = widgets.Button(description="modify to 1", button_style='success')
    btn_save = widgets.Button(description="save result", button_style='info')

    output_area = widgets.Output()

    def show_item():
        with output_area:
            clear_output(wait=True)
            if current['i'] >= len(df):
                print("✅ finished loop through")
                print("Please click save botton to save the modification.")
                return
            try:
                text = df.loc[current['i'], text_column]
                label = df.loc[current['i'], label_column]
            except Exception as e:
                print(f"❌ error while reading data {e}")
                return

            print(f"📄 No. {current['i'] + 1}  / in total {len(df)} ")
            print(f"text context:\n{text}\n")
            print(f"current label: {label}")
            print("Please choose: ")

    def keep_label(b):
        current['i'] += 1
        show_item()

    def set_label_0(b):
        df.at[current['i'], label_column] = 0
        current['i'] += 1
        show_item()

    def set_label_1(b):
        df.at[current['i'], label_column] = 1
        current['i'] += 1
        show_item()

    def save_result(b):
        df.to_csv(output_path, index=False)
        with output_area:
            print(f"✅ modification is saved to {output_path}")

    btn_keep.on_click(keep_label)
    btn_set_0.on_click(set_label_0)
    btn_set_1.on_click(set_label_1)
    btn_save.on_click(save_result)

    buttons_box = widgets.HBox([btn_keep, btn_set_0, btn_set_1, btn_save])

    display(output_area, buttons_box)
    show_item()




In [4]:
interactive_label_review('annotated__google.csv', text_column='description', label_column='label')


Output()

HBox(children=(Button(description='Keep the same', style=ButtonStyle()), Button(button_style='danger', descrip…

In [17]:
test_keywords = [ "ISO 9001","ISO 13485","ISO 27001","EU-GDPR"]
df_GxP_google = fetch_google_play_apps(test_keywords, lang='en', country='us', file_suffix='test')

Searching Google Play for: ISO 9001
Searching Google Play for: ISO 13485
Searching Google Play for: ISO 27001
Searching Google Play for: EU-GDPR
  × Search failed for keyword 'EU-GDPR': 'ds:4'
✅ Saved data to test_googleplay_en.csv, total apps after deduplication: 59


In [12]:
launch_annotation_tool(csv_path='test_googleplay_en.csv', output_path='annotated_test_google_GxP.csv')

HBox(children=(Button(button_style='danger', description='0 ❌ not relevant', style=ButtonStyle()), Button(butt…

Output()

In [14]:
f1 = pd.read_csv('annotated_test_google_GxP.csv')
# f2 = pd.read_csv('annotated__google.csv')
with open('annotated__google.csv', encoding='utf-8', errors='ignore') as f:
    f2 = pd.read_csv(f)   
merged = pd.concat([f1, f2], ignore_index=True)
merged.to_csv('annotated__google.csv', index=False)