[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/TCU-DCDA/WRIT20833-2025/blob/main/notebooks/codeAlongs/Lists-Loops-Colab.ipynb)

# The Ways We Count and Are Counted
## Lists, List Methods, and Loops in Digital Humanities

Welcome to our exploration of how we organize, count, and analyze information in the digital age. Today we'll learn about Python lists - one of the fundamental ways computers help us make sense of collections of data.

In digital humanities, we're constantly working with collections: books in a library, words in a text, people in a census, artifacts in a museum. Lists help us organize and analyze these collections systematically.

But our title hints at something deeper: **the ways we count AND the ways we are counted**. As we learn to organize and analyze cultural data, we must also consider how people, communities, and ideas are represented, categorized, and quantified in digital collections. Who gets counted? What gets measured? How do our analytical choices shape what we discover about culture and society?

### Understanding Data Organization: From Spreadsheets to Code

Before we dive into lists, let's connect Python data structures to something familiar - spreadsheets and tables:

- **Variables** = **Individual table cells**: Just like a single cell holds one piece of information (a name, date, or number), a variable stores one value
- **Lists** = **Table rows or columns**: Just as a row might contain all information about one person, or a column might contain all the ages in a dataset, a list stores multiple related items in order
- **Dictionaries** *(coming later)* = **Complete tables**: Like a spreadsheet with both rows and columns, dictionaries can store complex, structured information

Think of today's lesson as learning to work with a single row or column from a research spreadsheet - but with the power to analyze thousands of items automatically!
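
As a rough illustration of the analogy above, the sketch below stores one "cell" in a variable and one "column" in a list. The variable names and the ages are invented for illustration and do not come from a real dataset.

```python
# One "cell": a single value stored in a variable
author_name = "Jane Austen"

# One "column": several related values stored, in order, in a list
# (these ages are made-up sample values)
ages_column = [34, 52, 27, 41]

print(author_name)
print(ages_column)
print(f"The column holds {len(ages_column)} values")
```

The list keeps its items in the order they were entered, just as a spreadsheet column keeps its rows in order.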

## Part 1: What Are Lists?

Think of a list as a digital filing cabinet or a library catalog. Just like physical collections, our digital lists maintain order and allow us to access, organize, and analyze items systematically.

Let's start with a simple example from literary studies:

In [None]:
# A list of Shakespeare's major tragedies
shakespeare_tragedies = ["Hamlet", "Othello", "King Lear", "Macbeth"]

print("Shakespeare's Major Tragedies:")
print(shakespeare_tragedies)

### Try it yourself:
Create a list of your favorite books, historical periods, or research topics:

In [None]:
# Your turn: create a list related to your interests
my_list = []  # Replace with your own list

print(my_list)

## Part 2: Accessing Information - How We Find What We're Looking For

Just like finding a book on a library shelf, we can locate specific items in our lists. Python uses "indexing" - think of it like seat numbers or page numbers.

In [None]:
# Historical periods
time_periods = ["Ancient", "Medieval", "Renaissance", "Modern", "Contemporary"]

# Remember: Python starts counting from 0 (like ground floor = 0 in some countries)
print(f"First period: {time_periods[0]}")
print(f"Third period: {time_periods[2]}")
print(f"Last period: {time_periods[-1]}")  # -1 gives us the last item

### Discussion Question:
Why might it matter that Python, like many programming languages, starts counting from 0? How does this relate to how humans typically count and organize information?

In [None]:
# Try accessing different items from your list above
# What happens if you try to access an index that doesn't exist?

## Part 2.5: List Slicing - Extracting Portions of Collections

Just as scholars might focus on specific chapters of a book, decades in a timeline, or particular regions in a geographic study, we often need to work with portions of our data collections rather than entire lists. List slicing is like creating focused excerpts from larger datasets.

Think of slicing as creating targeted subcollections - perhaps you want to analyze only the first half of a century's literature, or examine the last few entries in a historical record, or study every third artifact in a museum collection.

### The Anatomy of Slicing: [start:stop:step]

List slicing uses the pattern `[start:stop:step]` where:
- **start**: where to begin (included)
- **stop**: where to end (excluded - like "until" rather than "through")
- **step**: how far to move forward each time (default is 1, which takes every item; 2 takes every second item)

This flexible system allows us to extract exactly the data we need for analysis.
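
Here is a minimal sketch of the `[start:stop:step]` pattern in action. The `decades` list is a hypothetical example made up for this illustration.

```python
# A hypothetical list of decades, invented for illustration
decades = ["1900s", "1910s", "1920s", "1930s", "1940s", "1950s"]

first_three = decades[0:3]   # start at index 0, stop before index 3
last_two = decades[-2:]      # from the second-to-last item to the end
every_other = decades[::2]   # the whole list, taking every second item
middle = decades[1:5:2]      # start at index 1, stop before index 5, step by 2

print("First three:", first_three)
print("Last two:", last_two)
print("Every other:", every_other)
print("Middle, stepping by 2:", middle)
```

Notice that slicing creates a new list and leaves `decades` unchanged, so you can take as many excerpts as you like from the same source.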

## Part 3: List Methods - Tools for Digital Scholarship

Just as scholars have developed methods for analyzing texts and historical sources, Python provides us with methods for working with lists. These are like specialized tools in a researcher's toolkit.

## Part 3.5: Conditional Logic with Lists - Making Decisions About Data

In humanities research, we constantly make decisions about our sources: Is this document from the right time period? Does this text contain the themes we're studying? Is this artifact from the culture we're examining?

Conditional logic - using `if`, `elif`, and `else` statements - allows us to program these kinds of scholarly decisions. Combined with comparison and logical operators, we can systematically filter, categorize, and analyze our collections based on specific criteria.

### The Building Blocks of Digital Decision-Making

**Comparison Operators** - for evaluating relationships:
- `==` (equal to), `!=` (not equal to)
- `>`, `<`, `>=`, `<=` (greater than, less than, etc.)

**Logical Operators** - for combining conditions:
- `and` (both conditions must be true)
- `or` (at least one condition must be true)
- `not` (reverses the condition)

**Membership Operators** - for checking whether an item is in a collection:
- `in`, `not in`

**String Methods for Text Analysis**:
- `.startswith()`, `.endswith()` - for checking how a string begins or ends
- `.replace()` - for cleaning and standardizing data by swapping one piece of text for another
- `.strip()` - for removing unwanted whitespace from the start and end of a string

These tools let us ask complex questions of our data: "Show me all authors from the 20th century whose names start with 'V'" or "Find all documents that contain both 'democracy' and 'freedom' but not 'war'."
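
The building blocks above can be combined as in the sketch below. The `authors_with_years` list, the 1890 cutoff, and the `themes` list are assumptions chosen only to illustrate the operators; the stray spaces around one name are deliberate so that `.strip()` has something to clean.

```python
# Hypothetical records: (name, birth year)
authors_with_years = [
    ("  Virginia Woolf ", 1882),
    ("James Joyce", 1882),
    ("Vladimir Nabokov", 1899),
    ("Toni Morrison", 1931),
]

for raw_name, birth_year in authors_with_years:
    name = raw_name.strip()  # remove leading and trailing whitespace
    if name.startswith("V") and birth_year < 1890:
        print(f"{name}: name starts with 'V', born before 1890")
    elif name.startswith("V"):
        print(f"{name}: name starts with 'V', born in 1890 or later")
    else:
        print(f"{name}: name does not start with 'V'")

# Membership checks with 'in' and 'not in'
themes = ["democracy", "freedom", "war"]
has_freedom = "freedom" in themes
lacks_peace = "peace" not in themes
print(f"Contains 'freedom': {has_freedom}")
print(f"Does not contain 'peace': {lacks_peace}")
```

Python checks the `if` condition first, then each `elif` in order, and runs the `else` block only when nothing above it matched.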

In [None]:
# Let's work with a list of authors we're studying
authors = ["Virginia Woolf", "James Joyce", "T.S. Eliot"]

print("Original list:", authors)
print(f"Number of authors: {len(authors)}")

### Adding to Our Collection - .append() and .extend()

In [None]:
# Adding one author to our collection
authors.append("Ezra Pound")
print("After adding Ezra Pound:", authors)

# Adding multiple authors at once
new_authors = ["Gertrude Stein", "William Faulkner"]
authors.extend(new_authors)
print("After extending with new authors:", authors)

### Finding Information - .index() and .count()

In [None]:
# Where in our list is a specific author?
# (.index() returns the index of the first match, counting from 0)
woolf_position = authors.index("Virginia Woolf")
print(f"Virginia Woolf is at index: {woolf_position}")

# Let's create a list with some repeated words from a text analysis
words_in_text = ["love", "death", "time", "love", "memory", "death", "love"]
love_count = words_in_text.count("love")
print(f"The word 'love' appears {love_count} times")

### Organizing Our Data - .sort() and .reverse()

In [None]:
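# Let's organize our authors alphabetically
# (.sort() compares the full strings, so this orders by first name)
print("Before sorting:", authors)
authors.sort()
print("After sorting:", authors)

# .reverse() flips the current order; because we just sorted, this gives Z to A
authors.reverse()
print("After reversing:", authors)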

### Try It Yourself: Building a Research Collection

In [None]:
# Create a list of research topics, historical events, or literary works
research_topics = []

# Add some initial topics


# Add more topics using extend()


# Sort your topics


# Print the final organized list
print("My research topics:", research_topics)

## Part 4: Loops - Systematic Analysis

In traditional scholarship, we might read through a stack of documents one by one, taking notes on each. Loops let us do this systematically with digital collections. This is where "counting" becomes powerful analysis.

### The Simple For Loop - Processing Each Item

In [None]:
# Let's analyze a collection of historical documents
documents = ["Letter to John Adams", "Diary Entry March 1776", "Speech to Continental Congress"]

print("Analyzing documents:")
for document in documents:
    print(f"- Processing: {document}")

### Loops with Enumeration - Keeping Track of Position

In [None]:
# Sometimes we need to know both the item and its position
manuscript_pages = ["Title Page", "Preface", "Chapter 1", "Chapter 2", "Bibliography"]

print("Manuscript Structure:")
for position, page in enumerate(manuscript_pages, 1):  # Start counting from 1
    print(f"Page {position}: {page}")

### Conditional Analysis - Filtering Our Data

In [None]:
# Let's find all the long titles in our collection
book_titles = [
    "1984",
    "To Kill a Mockingbird", 
    "One Hundred Years of Solitude",
    "Pride and Prejudice",
    "The Great Gatsby"
]

print("Books with titles longer than 15 characters:")
for title in book_titles:
    if len(title) > 15:
        print(f"- {title} ({len(title)} characters)")

## Part 4.5: Building and Filtering Lists - Constructing New Collections

One of the most powerful aspects of digital humanities work is the ability to create new collections from existing ones based on specific criteria. This is like a scholar going through a library's entire collection and creating a specialized bibliography of works that meet certain requirements.

### The Pattern: Empty List + Loop + Conditional Append

This fundamental pattern appears constantly in digital humanities research:
1. Start with an empty list to hold our results
2. Loop through our source collection
3. Check each item against our criteria
4. Add items that match to our new collection

This approach allows us to:
- Extract all works by female authors from a general literature list
- Find all historical events from a specific century
- Identify all artworks from a particular cultural movement
- Clean datasets by removing incomplete or problematic entries
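
The four steps above can be sketched as follows. The threshold of 20 characters is an arbitrary choice for illustration, and `long_titles` is a name introduced here.

```python
book_titles = [
    "1984",
    "To Kill a Mockingbird",
    "One Hundred Years of Solitude",
    "Pride and Prejudice",
    "The Great Gatsby",
]

# 1. Start with an empty list to hold the results
long_titles = []

# 2. Loop through the source collection
for title in book_titles:
    # 3. Check each item against the criterion
    if len(title) > 20:
        # 4. Add matching items to the new collection
        long_titles.append(title)

print("Titles longer than 20 characters:", long_titles)
```

The original `book_titles` list is untouched, so the same source can feed several different filtered collections.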

### Data Cleaning in Digital Humanities

Real-world cultural data is often messy. Historical records might have uncertain attributions marked with "(?)", incomplete information, or inconsistent formatting. Learning to systematically clean and standardize data is crucial for reliable analysis.

This involves:
- Removing unwanted characters or markers
- Standardizing spelling and formatting
- Handling missing or uncertain information
- Creating consistent categories for analysis
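
A minimal cleaning sketch, assuming uncertain attributions are marked with "(?)" as described above. The `raw_attributions` records are hypothetical and were made up for this example.

```python
# Hypothetical attribution records; "(?)" marks an uncertain attribution
raw_attributions = ["Sappho (?)", "  Homer", "Anonymous", "", "Hesiod (?) "]

cleaned = []
for entry in raw_attributions:
    # Drop the uncertainty marker, then trim leftover spaces
    name = entry.replace("(?)", "").strip()
    # Skip entries that are empty after cleaning
    if name != "":
        cleaned.append(name)

print(cleaned)
```

Removing the "(?)" marker also removes the record of uncertainty. Whether to discard it or keep it in a separate list is a research decision, not just a technical one.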

## Part 5: Counting and Being Counted - Digital Humanities Applications

Now let's explore how these tools help us understand how people and ideas are represented in digital collections.

### Word Frequency Analysis

In [None]:
# A simple text analysis - let's look at word frequency
# This could be from a historical speech, poem, or document
sample_text = ["freedom", "liberty", "justice", "freedom", "equality", "justice", "freedom", "democracy"]

print("Word Frequency Analysis:")
unique_words = []

for word in sample_text:
    if word not in unique_words:
        unique_words.append(word)
        count = sample_text.count(word)
        print(f"'{word}' appears {count} times")

## Part 5.5: Advanced List Tools - Professional Digital Humanities Techniques

As digital humanities projects grow in complexity, we need more sophisticated tools for managing and analyzing our collections. Python provides several advanced functions that make common research tasks more efficient and elegant.

### enumerate() - Keeping Track of Position and Content

When analyzing collections, we often need to know both what something is and where it appears. Think of a scholar who needs to cite not just a quotation, but its page number; or a curator who must track both an artifact and its catalog number.

The `enumerate()` function provides a systematic way to track position alongside content, essential for:
- Creating numbered bibliographies or catalogs
- Tracking line numbers in text analysis
- Maintaining source citations with positional references
- Creating structured reports with sequential numbering
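
As one possible illustration of tracking line numbers, the sketch below records which lines of a short text contain a search word. The `poem_lines` are invented for this example and are not from a real poem.

```python
# Hypothetical lines of verse, invented for illustration
poem_lines = [
    "The river keeps its memory",
    "of every stone it passed",
    "and the memory keeps the river",
]

lines_with_memory = []
# start=1 makes the numbering begin at 1, the way readers count lines
for line_number, line in enumerate(poem_lines, start=1):
    if "memory" in line:
        lines_with_memory.append(line_number)
        print(f"Line {line_number}: {line}")

print("The word 'memory' appears on lines:", lines_with_memory)
```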

### zip() - Combining Related Collections

In digital humanities, we often work with related but separate datasets - like having lists of authors, publication dates, and genres that all correspond to the same books. The `zip()` function allows us to work with these parallel collections simultaneously.

This is particularly valuable for:
- Combining biographical data (names, birth years, nationalities)
- Linking texts with their metadata (titles, authors, publication info)
- Connecting artifacts with their provenance information
- Merging historical events with their dates and locations
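
A short sketch of `zip()` with three parallel lists. The variable names (`titles`, `book_authors`, `years`, `catalog`) are introduced here for illustration.

```python
# Three parallel lists: item 0 in each list describes the same book
titles = ["Mrs Dalloway", "Ulysses", "Beloved"]
book_authors = ["Virginia Woolf", "James Joyce", "Toni Morrison"]
years = [1925, 1922, 1987]

catalog = []
# zip() hands us one item from each list on every pass through the loop
for title, author, year in zip(titles, book_authors, years):
    catalog.append(f"{title} by {author} ({year})")

for entry in catalog:
    print(entry)
```

If the lists have different lengths, `zip()` stops at the end of the shortest one without raising an error, so it is worth checking that parallel lists line up before combining them.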

### Counter - Systematic Frequency Analysis

The `Counter` class from the `collections` module in Python's standard library automates one of the most common tasks in digital humanities: counting how often things appear. Instead of tallying occurrences by hand, you pass `Counter` a list and it returns the count for every distinct item in a single step.

This is essential for:
- Word frequency studies in literary analysis
- Tracking recurring themes across collections
- Analyzing demographic patterns in historical data
- Identifying the most and least common elements in any dataset
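
A brief sketch of `Counter` applied to a small word list. The list `speech_words` is a sample made up for illustration.

```python
from collections import Counter

# A made-up sample of words, as if taken from a speech
speech_words = ["freedom", "liberty", "justice", "freedom",
                "equality", "justice", "freedom", "democracy"]

word_counts = Counter(speech_words)

# Look up a single word's count
print(f"'freedom' appears {word_counts['freedom']} times")

# Ask for the two most frequent words as (word, count) pairs
top_two = word_counts.most_common(2)
print("Two most common words:", top_two)
```

Asking a `Counter` about a word it never saw returns 0 rather than an error, which is convenient when comparing vocabularies across texts.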

### List Comprehensions - Elegant Filtering and Transformation

List comprehensions provide a concise, readable way to create new collections from existing ones. They resemble set-builder notation in mathematics: a single statement describes which items to keep and how to transform them.

This advanced technique allows experienced practitioners to:
- Create filtered collections more efficiently
- Transform data while applying conditions
- Write more readable and maintainable code
- Combine multiple operations into elegant one-liners
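
The sketch below shows three small list comprehensions: one that filters, one that transforms, and one that does both. The thresholds and variable names are arbitrary choices for illustration.

```python
periods = ["Ancient", "Medieval", "Renaissance", "Modern", "Contemporary"]

# Filter: keep only names longer than 7 characters
long_names = [period for period in periods if len(period) > 7]

# Transform: make every name lowercase
lowercase_names = [period.lower() for period in periods]

# Filter and transform together: uppercase the names that start with "M"
m_names_upper = [period.upper() for period in periods if period.startswith("M")]

print(long_names)
print(lowercase_names)
print(m_names_upper)
```

Each comprehension does the same work as the empty list + loop + conditional append pattern, just written on one line. When a comprehension becomes hard to read, the longer loop form is usually the better choice.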

### Analyzing Representation in Collections

In [None]:
# Let's analyze gender representation in a literary syllabus
syllabus_authors = [
    ("Virginia Woolf", "female"),
    ("James Joyce", "male"),
    ("Zora Neale Hurston", "female"),
    ("William Faulkner", "male"),
    ("Toni Morrison", "female"),
    ("Ernest Hemingway", "male")
]

male_count = 0
female_count = 0

print("Analyzing representation in our syllabus:")
for author, gender in syllabus_authors:
    print(f"- {author} ({gender})")
    if gender == "male":
        male_count += 1
    elif gender == "female":
        female_count += 1

total_authors = len(syllabus_authors)
print("\nSummary:")
print(f"Total authors: {total_authors}")
print(f"Male authors: {male_count} ({male_count/total_authors*100:.1f}%)")
print(f"Female authors: {female_count} ({female_count/total_authors*100:.1f}%)")

## Part 6: Your Turn - A Mini Research Project

Now it's time to apply what you've learned. Choose one of the three mini-projects (Options 1, 2, and 3) or create your own.

## Part 6.5: Walsh Textbook-Style Practice Exercises

These exercises mirror the style and requirements of chapters 8, 9, and 10 of Walsh's "Intro to Cultural Analytics" textbook. They focus on practical data manipulation tasks commonly encountered in digital humanities research, particularly working with historical and demographic data.

### Data Cleaning and Filtering Exercises

These exercises practice the essential skill of preparing messy, real-world data for analysis - a constant challenge in digital humanities work.

**Exercise Focus Areas:**
- Removing uncertainty markers (like "(?)" from historical records)
- Filtering collections based on specific criteria
- Creating new collections from existing ones using conditional logic
- Cleaning and standardizing text data for analysis

### Indexing and Enumeration Exercises

These exercises develop skills for systematic data organization and presentation, crucial for creating professional research outputs.

**Exercise Focus Areas:**
- Using `enumerate()` to create numbered lists and catalogs
- Combining position information with content analysis
- Creating structured reports with sequential numbering
- Managing both content and positional metadata

### Multi-List Analysis Exercises

These exercises practice working with complex, related datasets - reflecting the reality that humanities data often comes in interconnected pieces.

**Exercise Focus Areas:**
- Using `zip()` to combine related information from multiple sources
- Analyzing parallel collections (names, dates, locations, etc.)
- Creating comprehensive reports from distributed data
- Managing relationships between different data categories

### Frequency Analysis and Pattern Recognition

These exercises develop skills for identifying patterns and trends in cultural collections - fundamental to many digital humanities research questions.

**Exercise Focus Areas:**
- Using `Counter` for systematic frequency analysis
- Identifying most and least common patterns
- Comparing frequency distributions across different collections
- Drawing research conclusions from quantitative patterns

### Historical Demographic Analysis

Following the Walsh textbook's approach using historical immigration data, these exercises work with demographic and social data to understand how people were categorized and counted in historical contexts.

**Exercise Focus Areas:**
- Analyzing historical population data
- Understanding categorization systems in historical records
- Examining representation and bias in historical datasets
- Connecting quantitative analysis to broader historical questions

### Option 1: Analyze a Historical Timeline

In [None]:
# Create a list of historical events with dates
historical_events = [
    (1776, "Declaration of Independence"),
    (1789, "French Revolution begins"),
    (1865, "End of American Civil War"),
    (1914, "World War I begins"),
    (1969, "Moon landing")
]

# Your task: 
# 1. Add 3 more historical events
# 2. Sort the events chronologically
# 3. Print them in a formatted way
# 4. Find events from a specific century

# Write your code here:


### Option 2: Literary Analysis

In [None]:
# Analyze the length and characteristics of book titles
classic_novels = [
    "Pride and Prejudice",
    "1984",
    "To Kill a Mockingbird",
    "The Great Gatsby",
    "One Hundred Years of Solitude"
]

# Your task:
# 1. Add 5 more book titles
# 2. Calculate the average title length
# 3. Find the longest and shortest titles
# 4. Count how many titles contain specific words (like "The" or "and")

# Write your code here:


### Option 3: Museum Collection Analysis

In [None]:
# Analyze a museum collection
artifacts = [
    ("Ancient Greek Vase", "Ancient Greece", "Pottery"),
    ("Medieval Manuscript", "Medieval Europe", "Document"),
    ("Renaissance Painting", "Renaissance Italy", "Artwork"),
    ("Egyptian Papyrus", "Ancient Egypt", "Document")
]

# Your task:
# 1. Add more artifacts with (name, origin, type)
# 2. Count artifacts by type
# 3. List all artifacts from a specific origin (for example, "Ancient Egypt")
# 4. Create a summary report

# Write your code here:


## Reflection Questions

As we wrap up our exploration of lists and loops, consider these questions:

1. **Power of Counting**: How does systematic counting change the way we understand collections of cultural materials?

2. **Representation**: When we organize and count cultural data, what might we be missing? What biases might be built into our categories?

3. **Scale**: How does computational analysis allow us to work with collections that would be impossible to analyze manually?

4. **Human vs. Machine**: What kinds of insights can we gain from computational counting that human reading might miss? What might human analysis reveal that counting cannot?

Write your thoughts in the cell below:

### Your Reflections:

(Double-click this cell to edit and write your thoughts)


## Key Takeaways

Today we've learned:

- **Lists** help us organize collections of data systematically
- **List indexing and slicing** allow us to extract specific portions of our collections for focused analysis
- **List methods** provide tools for adding, organizing, and analyzing our collections
- **Conditional logic** enables us to make systematic decisions about our data using comparison and logical operators
- **Loops** allow us to process each item in a collection systematically
- **Building and filtering lists** lets us create new collections based on specific research criteria
- **Advanced tools** like `enumerate()`, `zip()`, and `Counter` let us track positions, combine parallel lists, and tally frequencies with less code
- **List comprehensions** offer elegant ways to filter and transform collections
- **Data cleaning techniques** prepare messy real-world cultural data for reliable analysis
- **Counting and analysis** can reveal patterns in cultural and historical data
- **Critical thinking** about data representation is essential in digital humanities

These are fundamental building blocks for digital humanities research. As you continue your studies, you'll use these concepts to analyze texts, examine historical patterns, and explore cultural collections in new ways.

Remember: every time we count, we're making choices about what matters and how to categorize the world. In digital humanities, being conscious of these choices is just as important as the technical skills themselves. The tools we've learned today will prepare you for the specific exercises in Walsh's "Intro to Cultural Analytics" textbook, while maintaining the critical perspective essential to humanities scholarship.