# 🧾 6.1 Introduction to Qualitative Research

This notebook explores qualitative methods for nutrition studies, such as analysing food preference surveys.

**Objectives**:
- Compare quantitative and qualitative approaches.
- Analyse textual survey data.
- Apply basic text processing techniques.

**Context**: Qualitative data, such as free-text answers about hippo food preferences, provides rich insights into dietary behaviours. 🦛

<details><summary>Fun Fact</summary>
Qualitative data is like a hippo’s diary—full of stories and nuances! 🦛
</details>

In [None]:
# Setup for Google Colab: Fetch datasets automatically or manually
import os
from google.colab import files

# Define the module and dataset for this notebook
MODULE = '06_qualitative'  # e.g., '01_infrastructure'
DATASET = 'food_preferences.txt'  # e.g., 'hippo_diets.csv'
BASE_PATH = '/content/data-analysis-toolkit-FNS'
MODULE_PATH = os.path.join(BASE_PATH, 'notebooks', MODULE)
DATASET_PATH = os.path.join('data', DATASET)

# Step 1: Attempt to clone the repository (automatic method)
# Note: If you encounter a cloning error (e.g., 'fatal: destination path already exists'),
#       reset the runtime (Runtime > Restart runtime) and run this cell again.
try:
    print('Attempting to clone repository...')
    if os.path.exists(BASE_PATH):
        print('Repository already exists, skipping clone.')
    else:
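        # Clone into BASE_PATH explicitly so the result does not depend on the current directory
        !git clone https://github.com/ggkuhnle/data-analysis-toolkit-FNS.git {BASE_PATH}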
    
    # Debug: Print directory structure
    print('Listing repository contents:')
    !ls {BASE_PATH}
    print('Listing notebooks directory contents:')
    !ls {BASE_PATH}/notebooks
    
    # Check if the module directory exists
    if not os.path.exists(MODULE_PATH):
        raise FileNotFoundError(f'Module directory {MODULE_PATH} not found. Check the repository structure.')
    
    # Set working directory to the notebook's folder
    os.chdir(MODULE_PATH)
    
    # Verify dataset is accessible
    if os.path.exists(DATASET_PATH):
        print(f'Dataset found: {DATASET_PATH} 🦛')
    else:
        raise FileNotFoundError(f'Dataset {DATASET} not found after cloning.')
except Exception as e:
    # Covers a failed clone as well as a missing module directory or dataset
    print(f'Automatic setup failed: {e}')
    print('Falling back to manual upload option...')

    # Step 2: Manual upload option
    print(f'Please upload {DATASET} manually.')
    print('1. Click the "Choose Files" button below.')
    print(f'2. Select {DATASET} from your local machine.')
    print(f'3. The file will be saved to {DATASET_PATH}, relative to the current working directory.')
    
    # Create the data directory if it doesn't exist
    os.makedirs('data', exist_ok=True)
    
    # Prompt user to upload the dataset
    uploaded = files.upload()
    
    # Check if the dataset was uploaded
    if DATASET in uploaded:
        with open(DATASET_PATH, 'wb') as f:
            f.write(uploaded[DATASET])
        print(f'Successfully uploaded {DATASET} to {DATASET_PATH} 🦛')
    else:
        raise FileNotFoundError(f'Upload failed. Please ensure you uploaded {DATASET}.')

# Install required packages for this notebook
%pip install pandas numpy
print('Python environment ready.')

In [1]:
# Install pandas (already present if the Colab setup cell above has run)
%pip install pandas
import pandas as pd
print('Qualitative analysis environment ready.')

## Data Preparation

Load `data/food_preferences.txt`, a survey of hippo food preferences with one free-text response per line.

In [2]:
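# Read every survey response; each list element is one line, including its trailing newline
with open('data/food_preferences.txt', 'r', encoding='utf-8') as f:
    responses = f.readlines()
print(responses[:2])  # Preview the first two responses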

['Hippo H1: I enjoy crunchy carrots for their sweetness.\n', 'Hippo H2: Grass is acceptable, but fruit is preferred.\n']
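
Each response shown above starts with a respondent label followed by a colon. A minimal sketch of separating the label from the answer text is given below. It assumes every line follows the `Hippo <ID>: <text>` pattern seen in the two previewed lines, which has not been checked against the full file; `split_response` and `sample_responses` are illustrative names, not part of the notebook.

```python
# Two lines copied from the preview above; in the notebook, pass `responses` instead.
sample_responses = [
    'Hippo H1: I enjoy crunchy carrots for their sweetness.\n',
    'Hippo H2: Grass is acceptable, but fruit is preferred.\n',
]

def split_response(line):
    """Split 'Hippo H1: some text' into ('Hippo H1', 'some text')."""
    respondent, sep, text = line.strip().partition(':')
    if not sep:
        # Assumption: a line without a colon has no respondent label
        return None, line.strip()
    return respondent.strip(), text.strip()

parsed = [split_response(line) for line in sample_responses]
for respondent, text in parsed:
    print(f'{respondent}: {text}')
```

Keeping the label apart from the text avoids counting words such as "hippo" that belong to the label rather than to the answer.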


## Basic Text Analysis

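Count how often keywords such as ‘carrots’ or ‘fruit’ occur in the responses. The count is case-insensitive and matches substrings, so ‘fruit’ also matches ‘fruits’ or ‘fruity’.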

In [3]:
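# Tally case-insensitive substring matches for each keyword across all responses
word_counts = {'carrots': 0, 'fruit': 0}
for response in responses:
    text = response.lower()  # Lowercase once per response
    for word in word_counts:
        word_counts[word] += text.count(word)
print(f'Word counts: {word_counts}')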

Word counts: {'carrots': 10, 'fruit': 15}
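
Because `str.count` matches substrings, it can overcount a keyword that appears inside a longer word. The sketch below contrasts substring counting with whole-word counting on a small sample. The third sample line is hypothetical and was written only to show the difference; `count_whole_words` and `count_substrings` are illustrative helpers, and the simple regular-expression tokeniser is an assumption rather than the method used in this course.

```python
import re

sample_responses = [
    'Hippo H1: I enjoy crunchy carrots for their sweetness.\n',
    'Hippo H2: Grass is acceptable, but fruit is preferred.\n',
    'Hippo H3: Fruity snacks are fine, but fresh fruit is best.\n',  # hypothetical line
]
keywords = ['carrots', 'fruit']

def count_substrings(lines, words):
    """Count case-insensitive substring matches (the approach used in the cell above)."""
    return {w: sum(line.lower().count(w) for line in lines) for w in words}

def count_whole_words(lines, words):
    """Count only tokens that equal a keyword exactly."""
    counts = {w: 0 for w in words}
    for line in lines:
        # Split into lowercase alphabetic tokens; digits and punctuation act as separators
        for token in re.findall(r"[a-z']+", line.lower()):
            if token in counts:
                counts[token] += 1
    return counts

substring_counts = count_substrings(sample_responses, keywords)
whole_word_counts = count_whole_words(sample_responses, keywords)
print('Substring counts: ', substring_counts)   # 'fruit' also matches 'Fruity'
print('Whole-word counts:', whole_word_counts)
```

Which variant is appropriate depends on the question: substring matching treats ‘fruit’ and ‘fruits’ as one concept, whereas whole-word matching is stricter and misses plurals.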


## Exercise 1: Expand Analysis

Add another word (e.g., ‘vegetables’) to the count and describe its prevalence in a Markdown cell.

**Answer**:

The word ‘vegetables’ appears...
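
One way to put a number on "prevalence" is to report both the total mentions and the share of responses that mention the word at least once. The sketch below does this on a small sample. The last two sample lines are hypothetical and are not taken from `food_preferences.txt`; `prevalence` is an illustrative helper name.

```python
sample_responses = [
    'Hippo H1: I enjoy crunchy carrots for their sweetness.\n',
    'Hippo H2: Grass is acceptable, but fruit is preferred.\n',
    'Hippo H3: Vegetables are fine when fruit runs out.\n',          # hypothetical line
    'Hippo H4: I like vegetables, especially leafy vegetables.\n',   # hypothetical line
]

def prevalence(lines, word):
    """Return (total mentions, responses mentioning the word, share of responses)."""
    word = word.lower()
    per_line = [line.lower().count(word) for line in lines]
    total = sum(per_line)
    n_responses = sum(1 for c in per_line if c > 0)
    share = n_responses / len(lines) if lines else 0.0
    return total, n_responses, share

total, n_responses, share = prevalence(sample_responses, 'vegetables')
print(f"'vegetables': {total} mentions in {n_responses} of "
      f"{len(sample_responses)} responses ({share:.0%})")
```

In the notebook, calling `prevalence(responses, 'vegetables')` would give the figures for the real dataset. Reporting the share of responses alongside the raw count helps when a few respondents repeat the same word many times.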

## Conclusion

You’ve explored qualitative data analysis with text processing. Next, dive deeper into text analysis in 6.2.

**Resources**:
- [Python Text Processing](https://docs.python.org/3/library/string.html)
- [Qualitative Research Guide](https://www.qualitative-research.net/)
- Repository: [github.com/ggkuhnle/data-analysis-toolkit-FNS](https://github.com/ggkuhnle/data-analysis-toolkit-FNS)