# 📚 Collections and Data Structures

**Phase 1 (Beginner) - Module 4 of 5**

**Estimated time**: 80-100 minutes

**Prerequisites**: [03_Functions_Methods.ipynb](03_Functions_Methods.ipynb)

## 🎯 Learning Goals

By the end of this module, you'll be able to:
- Work with Scala's collection types (List, Set, Map, Array)
- Understand immutable vs mutable collections
- Use functional operations (map, filter, reduce)
- Choose the right collection for your needs
- Work with tuples for grouping data
- Create and manipulate different data structures

---

## 📋 Table of Contents

1. [Introduction to Collections](#intro)
2. [Lists](#lists)
3. [Sets](#sets)
4. [Maps](#maps)
5. [Arrays](#arrays)
6. [Functional Operations](#operations)
7. [Tuples](#tuples)
8. [Exercises](#exercises)
9. [Next Steps](#next)

## 📊 Introduction to Collections

Collections are containers that hold multiple values. Scala's standard library provides a rich set of collection types, immutable by default, that share a common set of operations such as `map`, `filter`, and `fold`.

In [None]:
// First, let's explore what collections are
val numbers = List(1, 2, 3, 4, 5)
val fruits = List("apple", "banana", "orange")
val mixed = List("hello", 42, true, 3.14)  // Element type is inferred as Any

println(s"Numbers: $numbers")
println(s"Fruits: $fruits")
println(s"Mixed types: $mixed")
println(s"Runtime class of mixed: ${mixed.getClass}")  // The list's class, not its element type

println("\nCollection operations:")
println(s"First number: ${numbers.head}")
println(s"Rest of numbers: ${numbers.tail}")
println(s"Is empty? ${numbers.isEmpty}")
println(s"Size: ${numbers.size}")
println(s"Contains 3? ${numbers.contains(3)}")

**Collection Hierarchy:**
```
Iterable (trait)
├── Seq (ordered sequences)
│   ├── List
│   ├── Vector
│   └── ArrayBuffer
├── Set (unique elements)
│   ├── HashSet
│   └── TreeSet
└── Map (key-value pairs)
    ├── HashMap
    └── TreeMap
```

**Immutable vs Mutable:**
- **Immutable**: Cannot be changed (default, thread-safe)
- **Mutable**: Can be modified (use `scala.collection.mutable`)
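
To make the distinction concrete, here is a minimal sketch. The names `base`, `extended`, and `buffer` are illustrative and not part of the module. "Adding" to an immutable list returns a new list and leaves the original untouched, while a mutable buffer is changed in place.

```scala
import scala.collection.mutable

// Immutable: :+ builds a new list; `base` is not modified
val base = List(1, 2, 3)
val extended = base :+ 4
println(base)      // List(1, 2, 3)
println(extended)  // List(1, 2, 3, 4)

// Mutable: += modifies the same buffer in place
val buffer = mutable.ListBuffer(1, 2, 3)
buffer += 4
println(buffer.toList)  // List(1, 2, 3, 4)
```

Because an immutable collection never changes after it is created, it can be shared between functions or threads without defensive copying.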

**When to use each:**
- `List`: When you need ordering and sequential access
- `Set`: When you need uniqueness
- `Map`: When you need key-value lookup
- `Array`: When you need fast random access
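
The guidance above can be sketched with one small value per collection type. All names and sample values here (`steps`, `tags`, `ports`, `samples`) are made up for illustration.

```scala
// Ordering and sequential access: List
val steps = List("parse", "validate", "save")

// Uniqueness: Set silently drops the duplicate "scala"
val tags = Set("scala", "jvm", "scala")

// Key-value lookup: Map
val ports = Map("http" -> 80, "https" -> 443)

// Fast random access by index: Array
val samples = Array(10, 20, 30)

println(steps.head)      // parse
println(tags.size)       // 2
println(ports("https"))  // 443
println(samples(1))      // 20
```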

## 📝 Lists

Ordered, immutable sequences of elements.

In [None]:
// Creating lists
val emptyList = List()
val numbers = List(1, 2, 3, 4, 5)
val strings = List("Scala", "is", "awesome")
val filledList = List.fill(3)(42)  // List(42, 42, 42)
val rangeList = List.range(1, 6)   // List(1, 2, 3, 4, 5)

println(s"Empty: $emptyList")
println(s"Numbers: $numbers")
println(s"Strings: $strings")
println(s"Filled: $filledList")
println(s"Range: $rangeList")

// List operations
println("\nList operations:")
println(s"First: ${numbers.head}")
println(s"Last: ${numbers.last}")
println(s"All but first: ${numbers.tail}")
println(s"All but last: ${numbers.init}")
println(s"Take 3: ${numbers.take(3)}")
println(s"Drop 2: ${numbers.drop(2)}")
println(s"Slice 1-3: ${numbers.slice(1, 4)}")

In [None]:
// Adding elements (returns new list)
val original = List(1, 2, 3)

val prepend = 0 +: original     // Prepend single element
val append = original :+ 4       // Append single element
val concat = original ++ List(4, 5, 6)  // Concat lists
val insert = original.patch(1, List(10, 20), 0)  // Insert at index 1 (replaces 0 elements)

println(s"Original: $original")
println(s"Prepend 0: $prepend")
println(s"Append 4: $append")
println(s"Concat: $concat")
println(s"Insert at index 1: $insert")

// Pattern matching with lists
def describeList(list: List[Int]): String = list match {
  case Nil => "Empty list"
  case head :: Nil => s"Single element: $head"
  case head :: tail => s"Starts with $head, has ${tail.size} more elements"
}

println("\nPattern matching:")
println(describeList(List()))
println(describeList(List(42)))
println(describeList(List(1, 2, 3, 4, 5)))
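
The same `head :: tail` pattern is the usual building block for recursive list functions. As a hedged sketch (the function `sumList` and its inner helper `loop` are hypothetical names, not part of the module), a list can be summed by peeling off one element at a time:

```scala
import scala.annotation.tailrec

def sumList(list: List[Int]): Int = {
  // Carry the running total in `acc` so the recursion is a tail call
  @tailrec
  def loop(rest: List[Int], acc: Int): Int = rest match {
    case Nil          => acc                    // nothing left: return the total
    case head :: tail => loop(tail, acc + head) // add head, continue with tail
  }
  loop(list, 0)
}

println(sumList(List(1, 2, 3, 4, 5)))  // 15
println(sumList(Nil))                  // 0
```

In everyday code `list.sum` does the same job; the recursive form is shown only to illustrate how pattern matching decomposes a list.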

## 🎯 Sets

Collections of unique elements with no guaranteed iteration order.

In [None]:
// Creating sets (unique elements only)
val emptySet = Set()
val numbers = Set(1, 2, 3, 2, 1, 4)  // Duplicates removed
val strings = Set("apple", "banana", "apple", "cherry")

println(s"Empty set: $emptySet")
println(s"Numbers set: $numbers")  // Note: only unique values
println(s"Strings set: $strings")

// Set operations
val set1 = Set(1, 2, 3, 4, 5)
val set2 = Set(4, 5, 6, 7, 8)

println("\nSet operations:")
println(s"Set1: $set1")
println(s"Set2: $set2")
println(s"Union: ${set1 union set2}")
println(s"Intersection: ${set1 intersect set2}")
println(s"Difference (set1 - set2): ${set1 diff set2}")
println(s"Difference (set2 - set1): ${set2 diff set1}")

// Set membership
println(s"Set1 contains 3: ${set1.contains(3)}")
println(s"Set1 subset of itself: ${set1.subsetOf(set1)}")
println(s"Empty subset of set1: ${Set.empty[Int].subsetOf(set1)}")  // A bare Set() is a Set[Nothing] and would not type-check here

In [None]:
// Adding/removing elements (returns new set)
val original = Set(1, 2, 3)

val added = original + 4          // Add element
val addedMultiple = original ++ Set(4, 5)  // Add multiple
val removed = original - 2        // Remove element
val removedMultiple = original -- Set(1, 3)  // Remove multiple

println(s"Original: $original")
println(s"Add 4: $added")
println(s"Add 4,5: $addedMultiple")
println(s"Remove 2: $removed")
println(s"Remove 1,3: $removedMultiple")

// Mutable sets (can be modified)
import scala.collection.mutable

val mutableSet = mutable.Set(1, 2, 3)
println(s"\nMutable set before: $mutableSet")

mutableSet.add(4)
mutableSet.remove(2)
println(s"Mutable set after: $mutableSet")

// Use cases
val fruits = Set("apple", "banana", "orange")
val colors = Set("red", "yellow", "orange")

println("\nFruits and colors:")
println(s"Fruits: $fruits")
println(s"Colors: $colors")
println(s"Common (intersection): ${fruits intersect colors}")
println(s"Unique to fruits: ${fruits diff colors}")
println(s"Unique to colors: ${colors diff fruits}")
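
A common use of sets is removing duplicates. The sketch below compares three ways of doing that; the sample data and names (`readings`, `unique`, `firstSeen`, `sorted`) are assumptions for illustration. It also shows that a plain `Set` makes no promise about iteration order, while `distinct` and `SortedSet` each give a defined order.

```scala
import scala.collection.immutable.SortedSet

val readings = List(3, 1, 3, 2, 1)

// toSet: duplicates removed, iteration order not guaranteed
val unique = readings.toSet

// distinct: duplicates removed, first-seen order kept (result is still a List)
val firstSeen = readings.distinct

// SortedSet: duplicates removed, elements kept in ascending order
val sorted = SortedSet.empty[Int] ++ readings

println(unique.size)    // 3
println(firstSeen)      // List(3, 1, 2)
println(sorted.toList)  // List(1, 2, 3)
```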

## 🗺️ Maps

Collections of key-value pairs.

In [None]:
// Creating maps
val emptyMap = Map()
val ages = Map("Alice" -> 25, "Bob" -> 30, "Charlie" -> 35)
val grades = Map(("Math", 95), ("English", 87), ("Science", 92))

println(s"Empty map: $emptyMap")
println(s"Ages: $ages")
println(s"Grades: $grades")

// Accessing map values
println("\nAccessing values:")
println(s"Alice's age: ${ages("Alice")}")
println(s"Bob's age: ${ages.get("Bob")}")  // Some(30): get returns an Option
println(s"Unknown person's age: ${ages.get("Unknown")}")  // None
println(s"Unknown with default: ${ages.getOrElse("Unknown", -1)}")

// Map operations
println(s"\nMap operations:")
println(s"Keys: ${ages.keys}")
println(s"Values: ${ages.values}")
println(s"Contains Alice: ${ages.contains("Alice")}")
println(s"Size: ${ages.size}")
println(s"Is empty: ${ages.isEmpty}")
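
The difference between `ages("Alice")` and `ages.get("Alice")` matters for missing keys: `apply` throws a `NoSuchElementException`, while `get` returns an `Option` that can be pattern matched. A minimal sketch, in which the helper `describeAge` is a hypothetical name:

```scala
import scala.util.Try

val ages = Map("Alice" -> 25, "Bob" -> 30)

// Handle both outcomes of a lookup explicitly
def describeAge(name: String): String = ages.get(name) match {
  case Some(age) => s"$name is $age"
  case None      => s"No age recorded for $name"
}

println(describeAge("Alice"))    // Alice is 25
println(describeAge("Unknown"))  // No age recorded for Unknown

// apply on a missing key throws; Try captures the exception as a Failure
val risky = Try(ages("Unknown"))
println(risky.isFailure)         // true
```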

In [None]:
// Adding/updating elements (returns new map)
val original = Map("Alice" -> 25, "Bob" -> 30)

val added = original + ("Charlie" -> 35)       // Add entry
val updated = original + ("Alice" -> 26)       // Update existing
val removed = original - "Bob"                  // Remove entry
val addedMultiple = original ++ Map("Charlie" -> 35, "David" -> 40)

println(s"Original: $original")
println(s"Add Charlie: $added")
println(s"Update Alice: $updated")
println(s"Remove Bob: $removed")
println(s"Add multiple: $addedMultiple")

// Mutable maps
val mutableMap = scala.collection.mutable.Map("Alice" -> 25)
println(s"\nMutable map before: $mutableMap")

mutableMap.put("Bob", 30)
mutableMap.update("Alice", 26)
mutableMap.remove("Alice")
println(s"Mutable map after: $mutableMap")

In [None]:
// Map transformations
val grades = Map("Alice" -> 85, "Bob" -> 92, "Charlie" -> 78, "David" -> 96)

println(s"Original grades: $grades")

// Filter passing grades (>= 80)
val passingGrades = grades.filter { case (name, score) => score >= 80 }
println(s"Passing grades: $passingGrades")

// Transform values (add curve)
val curvedGrades = grades.map { case (name, score) => (name, math.min(100, score + 5)) }
println(s"Curved grades: $curvedGrades")

// Convert to different format
val letterGrades = grades.map {
  case (name, score) =>
    val letter = if (score >= 90) "A" else if (score >= 80) "B" else "C"
    (name, letter)
}
println(s"Letter grades: $letterGrades")

// Group students by grade range
val groupedByGrade = grades.groupBy {
  case (name, score) => if (score >= 90) "A" else if (score >= 80) "B" else "C"
}
println(s"Grouped by grade: $groupedByGrade")
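
Note that `groupBy` on a map returns a map of sub-maps (letter to the students in that group), which usually needs one more step to become a summary. The sketch below assumes the same thresholds as the cell above; the helper `toLetter` and the result names are illustrative.

```scala
val grades = Map("Alice" -> 85, "Bob" -> 92, "Charlie" -> 78, "David" -> 96)

// Hypothetical helper using the same thresholds as the cell above
def toLetter(score: Int): String =
  if (score >= 90) "A" else if (score >= 80) "B" else "C"

// Map[String, Map[String, Int]]: letter -> sub-map of (student -> score)
val byLetter = grades.groupBy { case (_, score) => toLetter(score) }

// Collapse each group to a head count
val countsByLetter = byLetter.map { case (letter, group) => (letter, group.size) }

// Or to an alphabetical list of student names
val namesByLetter = byLetter.map { case (letter, group) => (letter, group.keys.toList.sorted) }

println(countsByLetter("A"))  // 2
println(namesByLetter("A"))   // List(Bob, David)
```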

## 🗂️ Arrays

Mutable, fixed-size sequences with fast random access.

In [None]:
// Creating arrays
val emptyArray = Array.empty[Int]
val numbers = Array(1, 2, 3, 4, 5)
val strings = Array("Scala", "is", "fun")
val sizedArray = new Array[Int](5)  // Size 5, all zeros

println(s"Empty array: ${emptyArray.mkString(", ")}")
println(s"Numbers: ${numbers.mkString(", ")}")
println(s"Strings: ${strings.mkString(" ")}")
println(s"Sized array: ${sizedArray.mkString(", ")}")

// Array access and modification
println("\nArray access:")
println(s"numbers(0): ${numbers(0)}")    // Access (using parentheses, not brackets!)
println(s"numbers(2): ${numbers(2)}")

numbers(2) = 99  // Modify element
println(s"After numbers(2) = 99: ${numbers.mkString(", ")}")

// Array operations
println(s"\nArray operations:")
println(s"Length: ${numbers.length}")
println(s"Contains 99: ${numbers.contains(99)}")
println(s"Index of 99: ${numbers.indexOf(99)}")
println(s"Sum: ${numbers.sum}")
println(s"Max: ${numbers.max}")
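
Because arrays are plain mutable JVM arrays, two behaviours tend to surprise newcomers: `==` compares references rather than contents, and a second name for the same array sees every modification. A short sketch with illustrative names:

```scala
val a = Array(1, 2, 3)
val b = Array(1, 2, 3)

val equalByOperator = a == b             // false: different array objects
val equalByContent  = a.sameElements(b)  // true: compared element by element

println(equalByOperator)  // false
println(equalByContent)   // true

// `alias` points at the same array as `a`, so the write is visible through both
val alias = a
alias(0) = 99
println(a(0))  // 99
println(b(0))  // 1
```

Converting to a `List` first (`a.toList == b.toList`) is another way to compare contents.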

In [None]:
// Converting between collections
val list123 = List(1, 2, 3)
val array123 = Array(1, 2, 3)
val set123 = Set(1, 2, 3)

println(s"Original list: $list123")
println(s"Original array: ${array123.mkString(", ")}")
println(s"Original set: $set123")

// Convert list to array
val asArray = list123.toArray
println(s"List to array: ${asArray.mkString(", ")}")

// Convert array to list
val asList = array123.toList
println(s"Array to list: $asList")

// Convert set to list
val asList2 = set123.toList
println(s"Set to list: $asList2")

// Multi-dimensional arrays
val matrix = Array.ofDim[Int](2, 3)  // 2x3 matrix
matrix(0)(0) = 1
matrix(0)(1) = 2
matrix(0)(2) = 3
matrix(1)(0) = 4
matrix(1)(1) = 5
matrix(1)(2) = 6

println("\n2x3 Matrix:")
for (row <- matrix) {
  println(row.mkString("\t"))
}

## ⚙️ Functional Operations

Powerful operations that treat collections as pipelines of transformations.

In [None]:
val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

println(s"Original numbers: $numbers")

// map: transform each element
val doubled = numbers.map(_ * 2)
println(s"Doubled: $doubled")

// filter: keep only elements that match condition
val evens = numbers.filter(_ % 2 == 0)
val greaterThan5 = numbers.filter(_ > 5)
println(s"Evens: $evens")
println(s"Greater than 5: $greaterThan5")

// reduce: combine all elements into a single value (the operator should be associative; throws on an empty collection)
val sum = numbers.reduce(_ + _)
val product = numbers.reduce(_ * _)
println(s"Sum: $sum, Product: $product")

// fold: reduce with initial value
val sumFrom10 = numbers.fold(10)(_ + _)
println(s"Sum from 10: $sumFrom10")

// find: find first element that matches
val firstEven = numbers.find(_ % 2 == 0)
val firstGT10 = numbers.find(_ > 10)
println(s"First even: $firstEven, First > 10: $firstGT10")
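
Two details of `reduce` and `fold` are worth seeing in code: how they behave on an empty collection, and how `foldLeft` can build a result whose type differs from the element type. This is a small sketch with made-up names (`noNumbers`, `joined`, and so on):

```scala
import scala.util.Try

val noNumbers = List.empty[Int]

// reduce has no starting value, so it fails on an empty list
val attempt = Try(noNumbers.reduce(_ + _))
println(attempt.isFailure)  // true

// reduceOption and fold are the safe alternatives
val maybeSum  = noNumbers.reduceOption(_ + _)  // None
val foldedSum = noNumbers.fold(0)(_ + _)       // 0
println(maybeSum)
println(foldedSum)

// foldLeft: the accumulator is a String while the elements are Ints
val joined = List(1, 2, 3).foldLeft("") { (acc, n) =>
  if (acc.isEmpty) n.toString else s"$acc,$n"
}
println(joined)  // 1,2,3
```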

In [None]:
// Advanced operations
val words = List("Scala", "is", "awesome", "and", "powerful")

println(s"Words: $words")

// flatMap: transform and flatten
val letters = words.flatMap(_.toLowerCase.toList)
println(s"All letters: $letters")

// groupBy: group elements by a function
val groupedByLength = words.groupBy(_.length)
println(s"Grouped by length: $groupedByLength")

// sortBy: sort by a function
val sortedByLength = words.sortBy(-_.length)  // descending
println(s"Sorted by length desc: $sortedByLength")

// partition: split into two collections
val (shortWords, longWords) = words.partition(_.length <= 3)
println(s"Short words: $shortWords")
println(s"Long words: $longWords")

// Aggregation examples
val texts = List("Hello", "world", "from", "Scala")
val totalChars = texts.map(_.length).sum
val longestWord = texts.maxBy(_.length)
val avgWordLen = texts.map(_.length.toDouble).sum / texts.size

println(f"\nText stats:")
println(f"Total characters: $totalChars")
println(f"Longest word: $longestWord")
println(f"Average word length: $avgWordLen%.1f")
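
Chains of `filter`, `flatMap`, and `map` can also be written as a `for` comprehension, which the compiler translates into those same method calls. The sketch below, using illustrative names, shows the two forms producing the same result:

```scala
val words = List("Scala", "is", "awesome")

// Method-chaining form
val viaMethods = words
  .filter(_.length > 2)
  .flatMap(_.toLowerCase.toList)

// Equivalent for comprehension: the guard plays the role of filter,
// the second generator plays the role of flatMap
val viaFor = for {
  word <- words if word.length > 2
  char <- word.toLowerCase.toList
} yield char

println(viaMethods.mkString)    // scalaawesome
println(viaMethods == viaFor)   // true
```

Which form reads better is largely a matter of taste; the comprehension tends to help once several nested collections are involved.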

## 📦 Tuples Redux

Tuples are great for returning multiple values from functions.

In [None]:
// Functions returning tuples
def divideNumbers(dividend: Int, divisor: Int): (Int, Int) = {
  val quotient = dividend / divisor
  val remainder = dividend % divisor
  (quotient, remainder)
}

def getStats(numbers: List[Int]): (Int, Int, Double, Int) = {
  val sum = numbers.sum
  val count = numbers.size
  val average = sum.toDouble / count
  val max = numbers.max
  (sum, count, average, max)
}

// Using tuple-returning functions
val (quotient, remainder) = divideNumbers(17, 5)
println(f"17 ÷ 5 = $quotient with remainder $remainder")

val (total, count, avg, maximum) = getStats(List(10, 20, 30, 40, 50))
println(f"Stats: Sum=$total, Count=$count, Avg=$avg%.2f, Max=$maximum")

// Processing collections of tuples
val employees = List(
  ("Alice", "Developer", 75000),
  ("Bob", "Manager", 85000),
  ("Charlie", "Designer", 65000),
  ("Diana", "Developer", 78000)
)

println("\nEmployees:")
employees.foreach {
  case (name, role, salary) =>
    println(f"$name%-8s | $role%-10s | $$$salary%,d")
}

// Analyzing tuple data
val developerSalaries = employees.collect {
  case (name, "Developer", salary) => (name, salary)
}
println(s"\nDeveloper salaries: $developerSalaries")

val avgSalary = employees.map(_._3).sum.toDouble / employees.size
println(f"Average salary: $$$avgSalary%.0f")
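
Tuples also appear whenever two collections are combined element by element. As a hedged sketch with made-up sample data, `zip` pairs two lists up, `unzip` splits them again, and a list of pairs converts directly to a `Map`:

```scala
val names  = List("Alice", "Bob", "Charlie")
val scores = List(85, 92, 78)

// zip: pair elements by position
val paired = names.zip(scores)
println(paired)  // List((Alice,85), (Bob,92), (Charlie,78))

// unzip: split a list of pairs back into two lists
val (namesAgain, scoresAgain) = paired.unzip

// A list of pairs converts directly to a Map
val scoreByName = paired.toMap
println(scoreByName("Bob"))  // 92

// _1 and _2 access tuple fields by position
val best = paired.maxBy(_._2)
println(best._1)  // Bob
```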

## 🏆 Exercises

### Exercise 1: Word Processor

Create a program that analyzes text using collections.

In [None]:
// Exercise 1: Word Processor
// TODO: Replace ??? with your code

def analyzeText(text: String): Map[String, Int] = {
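  // Count the frequency of each word, ignoring case and punctuation
  text.toLowerCase
      .split("\\W+")           // split on runs of non-word characters
      .filter(_.nonEmpty)      // drop the empty token a leading separator would produce
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.length) }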
}

def getTopWords(wordFreq: Map[String, Int], topN: Int = 5): List[(String, Int)] = {
  // Get top N most frequent words
  ???
}

def findPalindromes(words: Set[String]): Set[String] = {
  // Find words that are palindromes
  ???
}

// Test the functions
val sampleText = "Scala is a powerful programming language. " +
                 "Scala programs are concise and powerful. " +
                 "Functional programming with Scala is great."

val wordFrequencies = analyzeText(sampleText)
println(s"Word frequencies: $wordFrequencies")

val top5Words = getTopWords(wordFrequencies)
println(s"Top 5 words: $top5Words")

val uniqueWords = wordFrequencies.keySet
val palindromes = findPalindromes(uniqueWords)
println(s"Palindrome words: $palindromes")

### Exercise 2: Student Grade Analyzer

Create a program that analyzes student grades using maps and lists.

In [None]:
// Exercise 2: Student Grade Analyzer
// TODO: Replace ??? with your code

def letterGrade(score: Int): String = {
  ???  // Convert numeric score to letter grade (90-100:A, 80-89:B, etc.)
}

def processGrades(grades: Map[String, List[Int]]): Map[String, (Double, String, Int)] = {
  // For each student, calculate: (average_score, letter_grade, num_subjects)
  ???
}

def classStatistics(processedGrades: Map[String, (Double, String, Int)]): 
                   (Double, String, List[String]) = {
  // Return: (class_average, most_common_grade, top_students)
  ???
}

// Sample data: Student -> List of scores
val studentGrades = Map(
  "Alice" -> List(85, 92, 88),
  "Bob" -> List(78, 82, 90),
  "Charlie" -> List(95, 98, 94),
  "Diana" -> List(88, 85, 89),
  "Eve" -> List(72, 79, 76)
)

println("Student Grade Analysis:")
println("=" * 40)

// Process individual student grades
val processed = processGrades(studentGrades)
for ((student, (avg, grade, subjects)) <- processed) {
  println(f"$student%-8s: Average=${avg}%.1f, Grade=$grade, Subjects=$subjects")
}

println("\nClass Statistics:")
val (classAvg, commonGrade, topStudents) = classStatistics(processed)
println(f"Class average: $classAvg%.1f")
println(s"Most common grade: $commonGrade")
println(s"Top performers: ${topStudents.mkString(", ")}")

### Exercise 3: Inventory Manager

Create an inventory system using different collection types.

In [None]:
// Exercise 3: Inventory Manager
// TODO: Replace ??? with your code

// Product definition using case class (introduced later but useful here)
case class Product(id: Int, name: String, price: Double, category: String, stock: Int)

// Inventory as Map[ProductId, Product]
val inventory: Map[Int, Product] = ???  // Map with sample products

def findByCategory(category: String): List[Product] = {
  ???
}

def calculateInventoryValue(): Double = {
  ???  // Sum of (price * stock) for all products
}

def getLowStockProducts(threshold: Int = 5): List[Product] = {
  ???
}

def getProductsByPriceRange(minPrice: Double, maxPrice: Double): List[Product] = {
  ???
}

// Analysis functions
def categorySummary(): Map[String, Int] = {
  ???  // Map[category, count_of_products]
}

def expensiveProducts(limit: Int = 3): List[Product] = {
  ???  // Top N most expensive products
}

// Execute analysis
println("Inventory Analysis:")
println("=" * 30)

println(f"Total inventory value: $$${calculateInventoryValue()}%.2f")
println(s"Products by category: ${categorySummary()}")

println("\nElectronics category:")
findByCategory("Electronics").foreach(p => 
  println(f"  ${p.name}%-15s: $$${p.price}%.2f (${p.stock} in stock)")
)

println("\nLow stock products:")
getLowStockProducts().foreach(p => 
  println(f"  ${p.name}%-15s: ${p.stock} remaining")
)

println("\nMid-range products ($50-$200):")
getProductsByPriceRange(50, 200).foreach(p => 
  println(f"  ${p.name}%-15s: $$${p.price}%.2f")
)

println("\nMost expensive products:")
expensiveProducts(3).foreach(p => 
  println(f"  ${p.name}%-15s: $$${p.price}%.2f")
)

## 📝 What Next?

🎉 **Congratulations!** You've mastered Collections and Data Structures!

**You've learned:**
- Scala's core collection types (List, Set, Map, Array)
- Immutable vs mutable collections
- Functional operations (map, filter, reduce, fold)
- Choosing the right collection for your needs
- Working with tuples for multi-value returns
- Transforming and analyzing collection data

**Key Concepts:**
- **Immutable first**: Use immutable collections by default
- **Functional composition**: Chain operations like `filter.map.reduce`
- **Type safety**: Collections maintain type information
- **Performance**: Choose collections based on access patterns
- **Power of pipelines**: Transform data through chained operations

**Best Practices:**
- Prefer immutable collections (`scala.collection.immutable`)
- Use `List` for sequential data
- Use `Set` for unique elements
- Use `Map` for key-value lookups
- Chain operations functionally rather than using loops

**Next Steps:**
1. Complete the exercises - they solidify collection usage patterns
2. Experiment with more complex transformations
3. Move to **05: Exercises** - comprehensive beginner project
4. Check the broader [Beginner Phase progress](../README.md#beginner)

**Advanced Topics to Explore:**
- Parallel collections for performance
- Custom collection implementations
- Stream processing
- Big data collections (Spark RDDs, Datasets)

**Performance Tip:** Choose collections based on access patterns:
- **Fast prepend**: List (appending is O(n); prefer Vector or ArrayBuffer for frequent appends)
- **Fast lookup**: Set/Map
- **Fast random access**: Array
- **Sorted data**: SortedSet/SortedMap
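
As a rough illustration of these access patterns (the values and names are made up, and no timings are measured here), the sketch below uses a `List` for prepending, a `Vector` for appending and indexed reads, and a `SortedMap` for keeping keys in order:

```scala
import scala.collection.immutable.SortedMap

// List: prepending with :: is constant time
val recent = 0 :: List(1, 2, 3)
println(recent)  // List(0, 1, 2, 3)

// Vector: appending and indexed access are both effectively constant time
val log = Vector(1, 2, 3) :+ 4
println(log(2))  // 3

// SortedMap: keys are kept in ascending order regardless of insertion order
val ranking = SortedMap("carol" -> 3, "alice" -> 1, "bob" -> 2)
println(ranking.keys.toList)  // List(alice, bob, carol)
```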

---

*"Data is the new oil, and collections are how you refine it."*