[MEDIUM] #1078 - Occurrences After Bigram


# 🎯 Problem: Occurrences After Bigram

## 📖 The Real Problem

You are given:
1. A **text** (string of words separated by spaces)
2. A **first** word
3. A **second** word

Your task: Find all words that **immediately follow** the bigram "first second" in the text.

**What is a bigram?**
- A bigram is a pair of consecutive words
- Example: In "the quick brown fox", bigrams are: (the,quick), (quick,brown), (brown,fox)

**Why this problem exists:**
- Fundamental text processing operation
- Tests string manipulation and pattern matching
- Foundation for more complex NLP tasks

## 💡 Why This Matters

**Real-world applications:**
- **Search engines** - phrase search and autocomplete
- **Text prediction** - next word prediction in keyboards
- **Language modeling** - statistical language models
- **Plagiarism detection** - matching phrase patterns
- **Chatbots** - response generation based on context

**Skills you'll develop:**
- ✅ String splitting and manipulation
- ✅ Pattern matching in text
- ✅ List comprehension
- ✅ Index tracking in arrays

## 📋 Contributor Tasks

### Step 1: Understand the Problem
1. Read the text carefully
2. Identify all occurrences of "first second" pattern
3. For each occurrence, note the word that follows
4. Collect all such words

### Step 2: Plan Your Approach
1. Split text into words (list of words)
2. Iterate through the word list
3. Look for pattern: words[i] == first AND words[i+1] == second
4. When found, add words[i+2] to results (if exists)
5. Return the list of result words

### Step 3: Implement the Solution
1. Split text by spaces into word list
2. Loop through indices from 0 to len-3
3. Check if current and next word match first and second
4. If match, append the word after them to results
5. Return results list

### Step 4: Test Your Solution
1. Test with pattern at beginning
2. Test with pattern at end
3. Test with multiple occurrences
4. Test with no occurrences
5. Test with pattern appearing consecutively

## ✅ Expected Outcome

**Function Signature:**
```python
def findOcurrences(text, first, second):
    """
    Find all words following the bigram 'first second' in text.
    
    Args:
        text: str - input text (space-separated words)
        first: str - first word of bigram
        second: str - second word of bigram
    
    Returns:
        List[str]: list of words following the bigram
    
    Example:
        >>> findOcurrences("alice is a good girl she is a good student", "a", "good")
        ['girl', 'student']
    """
```

**Expected Behavior:**
- ✅ Returns list of all words following "first second"
- ✅ Returns empty list if pattern not found
- ✅ Handles multiple occurrences correctly
- ✅ Case-sensitive matching
- ✅ Preserves order of occurrences

**Example Test Cases:**
```python
# Test 1: Basic example
text = "alice is a good girl she is a good student"
findOcurrences(text, "a", "good")
# Returns: ['girl', 'student']
# Explanation: "a good" appears twice, followed by "girl" and "student"

# Test 2: Pattern at end (no following word)
text = "we will we will rock you"
findOcurrences(text, "we", "will")
# Returns: ['we', 'rock']
# Explanation: "we will" appears twice, followed by "we" and "rock"

# Test 3: No occurrences
text = "hello world"
findOcurrences(text, "foo", "bar")
# Returns: []

# Test 4: Single occurrence
text = "the quick brown fox"
findOcurrences(text, "quick", "brown")
# Returns: ['fox']

# Test 5: Consecutive patterns
text = "a b a b a b c"
findOcurrences(text, "a", "b")
# Returns: ['a', 'a', 'c']
```

## 📚 Additional Context & References

### Understanding Bigrams

**What is a Bigram?**
- A bigram is a sequence of 2 adjacent elements
- In text: 2 consecutive words
- Used extensively in NLP and speech recognition

**Example:**
```
Text: "the cat sat on the mat"
Bigrams: (the,cat), (cat,sat), (sat,on), (on,the), (the,mat)
```

### Implementation Approaches

**Approach 1: Simple Iteration** (Recommended)
```python
def findOcurrences(text, first, second):
    words = text.split()
    result = []
    
    for i in range(len(words) - 2):
        if words[i] == first and words[i+1] == second:
            result.append(words[i+2])
    
    return result
```

**Approach 2: List Comprehension**
```python
def findOcurrences(text, first, second):
    words = text.split()
    return [words[i+2] for i in range(len(words)-2) 
            if words[i] == first and words[i+1] == second]
```

### Hints (Use Only If Stuck!)

<details>
<summary>💡 Hint 1</summary>
Split the text into a list of words first. Then iterate through the list.
</details>

<details>
<summary>💡 Hint 2</summary>
You need to check three consecutive positions: i, i+1, and i+2.
</details>

<details>
<summary>💡 Hint 3</summary>
Make sure you don't go out of bounds when accessing i+2.
</details>

### Complexity Analysis

**Time Complexity:** O(n)
- Split text: O(n) where n is text length
- Single pass through words: O(m) where m is number of words
- Overall: O(n)

**Space Complexity:** O(m)
- Store words list: O(m)
- Store results: O(k) where k is number of occurrences
- Overall: O(m)

### Edge Cases to Consider

1. **Empty text:** Return []
2. **Less than 3 words:** Return [] (can't have bigram + following word)
3. **Pattern at end:** Don't include (no following word)
4. **Overlapping patterns:** Handle correctly (e.g., "a a a" with first="a", second="a")
5. **Case sensitivity:** "The" ≠ "the"

### Related Problems

Once you solve this, try:
- **Valid Palindrome** - String manipulation
- **Reverse Words in a String** - Word processing
- **Implement strStr()** - Pattern matching
- **Text Justification** - Advanced text processing

### Helpful Resources

- [String Split in Python](https://docs.python.org/3/library/stdtypes.html#str.split)
- [N-gram Language Models](https://web.stanford.edu/~jurafsky/slp3/3.pdf)
- [LeetCode 1078 - Occurrences After Bigram](https://leetcode.com/problems/occurrences-after-bigram/)

## 📝 Notes

- Words are separated by single spaces
- Matching is case-sensitive
- Return words in order of occurrence
- Don't include duplicates automatically (only if they appear multiple times)
- Text contains only lowercase English letters and spaces (in most cases)

---

**Ready to contribute?**

1. Fork the repository
2. Create your solution file
3. Test with provided examples
4. Submit a pull request!

**File Location:** `exercises/1000_programs/medium/1078_occurrences_after_bigram.py`

🚀 Happy coding!


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MEDIUM] #1078 - Occurrences After Bigram #1865

🎯 Problem: Occurrences After Bigram

📖 The Real Problem

💡 Why This Matters

📋 Contributor Tasks

Step 1: Understand the Problem

Step 2: Plan Your Approach

Step 3: Implement the Solution

Step 4: Test Your Solution

✅ Expected Outcome

📚 Additional Context & References

Understanding Bigrams

Implementation Approaches

Hints (Use Only If Stuck!)

Complexity Analysis

Edge Cases to Consider

Related Problems

Helpful Resources

📝 Notes

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

[MEDIUM] #1078 - Occurrences After Bigram #1865

Description

🎯 Problem: Occurrences After Bigram

📖 The Real Problem

💡 Why This Matters

📋 Contributor Tasks

Step 1: Understand the Problem

Step 2: Plan Your Approach

Step 3: Implement the Solution

Step 4: Test Your Solution

✅ Expected Outcome

📚 Additional Context & References

Understanding Bigrams

Implementation Approaches

Hints (Use Only If Stuck!)

Complexity Analysis

Edge Cases to Consider

Related Problems

Helpful Resources

📝 Notes

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions