## Python syntax for R users
   1. List manipulation
   2. 0-based indexing
   3. Control flow syntax and indentation
   4. for and if
   5. range()
   6. String manipulation and formatting
   7. Dot notation
   8. Python specific data types (tuples, sets, dictionaries)
   9. Functions, lambdas, sorting and filtering


## Python lists and loops

**Problem** Given a list of animal names, access and print the first, third, and last elements, and then print a sub-list containing the second to fourth elements.

In [None]:
# List of animal names
animals = ["cat", "dog", "elephant", "giraffe", "lion", "monkey", "penguin", "zebra"]

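# Access and print the first, third, and last elements
first_animal = animals[0]  # 0-based indexing: the first element is at index 0
third_animal = animals[2]
last_animal = animals[-1]  # negative indexing counts from the end
print(first_animal, third_animal, last_animal)

# Slice and print a sub-list containing the second to fourth elements
second_to_fourth_animals = animals[1:4]  # the stop index (4) is excluded
print(second_to_fourth_animals)

# Slice and print a sub-list containing the last three elements
last_three_animals = animals[-3:]
print(last_three_animals)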

# Lists are mutable, so we can change the value of an element
animals[0] = "tiger"
print(animals)

# Lists can contain values of any type
# As in R, a single list can mix different types
animals[0] = 1
print(animals)

# Concatenate another list to the end of the list
canadian_animals = ["beaver", "moose"]
animals.extend(canadian_animals)  # modifies the list in place
# Alternatives that give the same result:
# animals = animals + canadian_animals
# animals += canadian_animals
print(animals)

**Problem**: Given a list of random integer values, describe the list (count, min, and max values), count the occurrences of even and odd numbers, and store the counts in separate variables.

In [None]:
# List of random integer values
random_numbers = [23, 17, 41, 6, 33, 45, 2, 19, 39, 12, 27, 22, 30, 4, 10, 40, 48, 36, 7, 1]

# Let's describe the list
print(len(random_numbers))
print(max(random_numbers))
print(min(random_numbers))

In [None]:
# Initialize variables to store counts of even and odd numbers
even_count = 0
odd_count = 0

# Iterate through the list using a for loop
for i in range(len(random_numbers)):
    # Check if the number is even or odd using comparison operators (==, !=, >, <, >=, <=); logical operators (and, or, not) combine conditions
    if random_numbers[i] % 2 == 0:
        even_count += 1
    else:
        odd_count += 1

# # Alternative (more idiomatic): iterate directly over the elements instead of the indices
# for number in random_numbers: 
#     # Check if the number is even or odd
#     if number % 2 == 0:
#         even_count += 1
#     else:
#         odd_count += 1
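
The loop above relies on `range()` to generate indices. As a minimal sketch (the variable names below are illustrative, not part of the exercise), this is how `range()` and `enumerate()` behave:

```python
# range(stop) counts from 0 up to, but not including, stop
first_five = list(range(5))
print(first_five)  # [0, 1, 2, 3, 4]

# range(start, stop, step) is roughly R's seq(start, stop - 1, by = step)
evens_below_ten = list(range(0, 10, 2))
print(evens_below_ten)  # [0, 2, 4, 6, 8]

# enumerate() yields (index, value) pairs when both are needed
numbers = [23, 17, 41]
pairs = []
for index, number in enumerate(numbers):
    pairs.append((index, number))
print(pairs)  # [(0, 23), (1, 17), (2, 41)]
```

Unlike R's `1:n`, a `range` never includes its stop value, which fits 0-based indexing: `range(len(x))` covers exactly the valid indices of `x`.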

## Logical operators: R vs Python
* 🔵 R: `&`, `|`, `!`
* 🐍 Python: `and`, `or`, `not`

```python
# Python
True and False # False
1 == 1 and 3 % 2 == 1 # True
```

`and` and `or` are short-circuit operators. This means that the second operand is evaluated only if needed. This is the same behavior as in R with `&&` and `||`.
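
A small sketch of short-circuit evaluation. The `check` helper is hypothetical and exists only to record which operands were actually evaluated:

```python
calls = []

def check(label, result):
    """Hypothetical helper: record that it was called, then return result."""
    calls.append(label)
    return result

# `and` stops at the first falsy operand, so "second" is never evaluated
outcome_and = check("first", False) and check("second", True)
print(outcome_and, calls)  # False ['first']

calls.clear()

# `or` stops at the first truthy operand
outcome_or = check("first", True) or check("second", False)
print(outcome_or, calls)  # True ['first']
```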

Important: operator precedence applies. See [Python operator precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence).

In Python, `&` and `|` are bitwise operators. This means that they operate on the binary representation of the operands. For example, `1 & 3` is `1` because `1` is `01` in binary and `3` is `11` in binary. The result of `&` is `01` which is `1` in decimal. 
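
The following sketch (with illustrative values for `x` and `y`) shows why writing R-style `&` between comparisons can give a surprising result in Python: `&` binds more tightly than the comparison operators.

```python
x, y = 2, 3

# Logical `and`: each comparison is evaluated as written
logical_result = x > 0 and y > 0
print(logical_result)  # True

# Bitwise `&` has higher precedence than `>`, so this is parsed as
# x > (0 & y) > 0, that is 2 > 0 > 0, which is False
bitwise_result = x > 0 & y > 0
print(bitwise_result)  # False

# Parentheses restore the intended meaning
parenthesized_result = (x > 0) & (y > 0)
print(parenthesized_result)  # True

print(1 & 3)  # 1  (binary 01 & 11 = 01)
print(1 | 2)  # 3  (binary 01 | 10 = 11)
```

For plain boolean conditions, `and`, `or`, and `not` are usually the safer choice.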



## The `in` operator: R vs Python
* 🔵 R: `%in%`
* 🐍 Python: `in`

```python
# Python
1 in [1, 2, 3] # True
1 in [2, 3, 4] # False
```
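
A few more cases worth knowing, sketched with made-up data. One difference from R: `%in%` is vectorized, whereas Python's `in` tests a single value, so testing several values requires an explicit loop or comprehension.

```python
# `not in` is the negated form (in R: !(x %in% y))
is_missing = 5 not in [1, 2, 3]
print(is_missing)  # True

# On strings, `in` tests for a substring
has_substring = "thon" in "Python"
print(has_substring)  # True

# On dictionaries, `in` tests the keys, not the values
ages = {"Alice": 19, "Bob": 17}  # hypothetical data
print("Alice" in ages)  # True
print(19 in ages)       # False

# Testing several values at once: check each element explicitly
values = [1, 5]
allowed = [1, 2, 3]
membership = [value in allowed for value in values]
print(membership)  # [True, False]
```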



## String manipulation

In [None]:
# String creation
greeting = "Hello"
name = "students"

# String concatenation in the form of "Hello, students. Welcome to the Python course!"
message = greeting + ", " + name + ". Welcome to the Python course!"
print(message)

In [None]:
# String formatting using the format method
message = "{}, {}. Welcome to the Python course!".format(greeting, name)
print(message)

In [None]:
# String formatting using f-strings
message = f"{greeting}, {name}. Welcome to the Python course!"
print(message)
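
f-strings can also evaluate expressions and apply format specifiers, somewhat like `sprintf()` in R. A short sketch with illustrative values:

```python
name = "students"
count = 20
ratio = 2 / 3

# Any expression can go inside the braces
summary = f"{name.upper()} enrolled: {count + 1}"
print(summary)  # STUDENTS enrolled: 21

# A format specifier after ':' controls how the value is displayed
two_decimals = f"Ratio: {ratio:.2f}"
as_percent = f"Percent: {ratio:.1%}"
zero_padded = f"Padded: {count:05d}"
print(two_decimals)  # Ratio: 0.67
print(as_percent)    # Percent: 66.7%
print(zero_padded)   # Padded: 00020
```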

In [None]:
# String methods
# Converting string to uppercase
print(greeting.upper())

# Converting string to lowercase
print(greeting.lower())

# Replacing part of a string
message = message.replace("Python", "Python for R programmers")
print(message)

# Splitting a string into a list of words
# Roughly equivalent to R's strsplit(message, " ")
words = message.split(" ")
print(words)

# Joining a list of words into a string
# Equivalent to R's paste(words, collapse = " ")
print(" ".join(words))

# Stripping leading and trailing whitespace
extra_spaces = "   Hello, world!   "
print(extra_spaces.strip())

## Dot-notation and object-oriented programming

Python is an object-oriented programming language: every value in Python is an object, with properties (also called attributes) and methods that are accessed using the dot-notation.

* Properties are variables that are associated with the object.
* Methods are functions that are associated with the object.
* Classes are blueprints for creating objects.
* Objects are instances of classes, or in other words, objects are created using classes.
* Properties and methods are accessed using the dot-notation.

Examples:
* `[].append(1)` where `append` is a method of the list class.
* `"Hello world".upper()` where `upper` is a method of the string class.

### Naming conventions (🚨IMPORTANT🚨)

* Class names are written in UpperCamelCase 🐫 (Cars, Animals, SpeciesObservations)
* Variable names, function names, and method names are written in snake_case 🐍 (cars, animals, species_observations)
* Constants are written in ALL_CAPS (API_KEY, PI)
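
To tie the vocabulary and the naming conventions together, here is a minimal sketch of a class. `ElectricCar`, `MAX_SPEED_KMH`, and the attribute names are invented for illustration:

```python
MAX_SPEED_KMH = 200  # constant: ALL_CAPS


class ElectricCar:  # class name: UpperCamelCase
    def __init__(self, model_name, speed_kmh):
        # Properties (attributes) are stored on the object itself
        self.model_name = model_name
        self.speed_kmh = speed_kmh

    def accelerate(self, increase_kmh):  # method name: snake_case
        """Increase the speed without exceeding MAX_SPEED_KMH."""
        self.speed_kmh = min(self.speed_kmh + increase_kmh, MAX_SPEED_KMH)
        return self.speed_kmh


my_car = ElectricCar("roadster", 150)  # an object is an instance of a class
print(my_car.model_name)               # property access: roadster
print(my_car.accelerate(100))          # method call: 200
```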




## 💪 Exercise 1
A simple exercise that requires the use of loops, if statements, and string manipulation.

**Problem:** 

Given a paragraph of text, perform the following tasks:

1. Split the paragraph into sentences. Assume that all sentences end with a period.
2. Print the number of sentences in the paragraph.
3. For each sentence, if the sentence has more than five words, convert it to uppercase, else convert it to lowercase.
4. Print each sentence on a separate line.

Here's an example paragraph:


In [None]:
text = "Hello, my name is Alice. I love Python. Python is a great programming language. It is widely used for data analysis. I hope you enjoy learning Python too."

# Step 1: Split the paragraph into sentences
sentences = text.split('. ')

# Step 2: Print the number of sentences
print("Number of sentences:", len(sentences))

# Step 3 & 4: Process each sentence
for sentence in sentences:
    # Remove trailing period
    if sentence.endswith('.'):
        sentence = sentence[:-1]

    # Split sentence into words
    words = sentence.split(' ')
    
    # Check number of words and convert case accordingly
    if len(words) > 5:
        sentence = sentence.upper()
    else:
        sentence = sentence.lower()
    
    print(sentence)

## Python specific data types
### Tuple

In [None]:
# List of students (name, age) as tuples
students = [
    ("Alice", 19),
    ("Bob", 17),
    ("Charlie", 21),
    ("David", 18),
    ("Eva", 20),
]

new_student = ("Denis", 22)

# Unpack the tuple
name, age = new_student

* Tuples are created using parentheses `()`.
* Indexing and slicing are supported (same as lists).
* Tuples are immutable: once a tuple is created, its elements cannot be changed, added, or removed.
* Ordered collection of elements.
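
A short sketch of these properties, using an illustrative `point` tuple:

```python
point = (3, 4)

# Indexing works as with lists
print(point[0])   # 3
print(point[-1])  # 4

# Assignment to an element fails because tuples are immutable
assignment_failed = False
try:
    point[0] = 10
except TypeError as error:
    assignment_failed = True
    print(f"Cannot modify a tuple: {error}")

# A one-element tuple needs a trailing comma; (5) is just the integer 5
single = (5,)
not_a_tuple = (5)
print(type(single).__name__, type(not_a_tuple).__name__)  # tuple int
```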

In the wild, functions might return tuples. For example, the `divmod` function returns a tuple of two values: the quotient and the remainder.

```python
divmod(10, 3) # (3, 1)
```

They are also often used to store data that is not supposed to be changed.

In [None]:
# Example : 
# List students older than 18
older_students = []
for student in students:
    name, age = student  # Tuple unpacking
    if age > 18:
        older_students.append(name)

# Print the filtered list with a custom message
for name in older_students:
    message = f"{name} is older than 18 years old."
    print(message)

### Dictionary

In [None]:
species = {
    "scientific_name": "Dreissena polymorpha",
    "exotic": True,
    "vernacular_name": "zebra mussel",
    "kingdom": "Animalia",
    "class": "Bivalvia",
    "family": "Dreissenidae"
}

# Accessing values using keys
print(species["scientific_name"])

# Adding a new key-value pair
species["phylum"] = "Mollusca"

# Updating an existing key-value pair
species["exotic"] = False

# Removing a key-value pair
del species["class"]

# Iterating through a dictionary
for key, value in species.items():
    print(f"{key}: {value}")

* Dictionaries are created using curly braces `{}`.
* Mutable collection of key-value pairs.
* Indexing using *keys* to access *values*.
* Values can be of any type; keys must be hashable (for example strings, numbers, or tuples).
* Useful for storing data that will be accessed using a key.
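
Indexing a key that does not exist raises a `KeyError`. The sketch below shows two common ways to handle missing keys; the `counts_by_site_year` data is hypothetical:

```python
record = {"scientific_name": "Dreissena polymorpha", "exotic": True}

# `in` checks whether a key exists before indexing
has_family = "family" in record
print(has_family)  # False

# .get() returns a default value instead of raising KeyError
family = record.get("family", "unknown")
print(family)  # unknown

# Tuples are hashable, so they can serve as keys
counts_by_site_year = {("site_a", 1990): 3}
counts_by_site_year[("site_a", 1991)] = 5
print(counts_by_site_year[("site_a", 1991)])  # 5
```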

#### Dictionary example: counting word occurrences in a text

In [None]:
# Example: counting how many times each word occurs in a text
text = "Hello, my name is Vincent. I love Python. Python is a great programming language. It is widely used for data analysis. I hope you enjoy learning Python too!"

# Remove punctuation (periods, commas, exclamation marks) so that only words remain
text = text.replace(".", "").replace(",", "").replace("!", "")

# Split the text into words
words = text.split(" ")

# Create a dictionary to hold the counts
word_counts = {}

# Count the occurrence of each word
for word in words:
    word = word.lower()  # Convert to lowercase to count 'Python' and 'python' as the same word
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1

# Print the word counts
for word, count in word_counts.items():
    print(f"{word}: {count}")
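
The standard library offers a shortcut for this counting pattern. As a sketch on a shortened version of the text, `collections.Counter` builds the same kind of word-to-count mapping in one step:

```python
from collections import Counter

text = "I love Python. Python is a great programming language."

# Lowercase, drop periods, and split on whitespace
words = text.lower().replace(".", "").split()

# Counter is a dictionary subclass that counts hashable items
word_counts = Counter(words)

print(word_counts["python"])       # 2
print(word_counts["missing"])      # 0 (missing keys count as zero)
print(word_counts.most_common(1))  # [('python', 2)]
```

Writing the loop by hand first is still worthwhile, since it shows what `Counter` does internally.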

### Sets

In [None]:
# Let's create a set of fruits
fruits = {"apple", "banana", "cherry", "apple"}

# Even though "apple" is listed twice, it appears only once in the set
# Note: sets are unordered, so the printed order may differ from the comments below
print(fruits)  # Output: {"apple", "banana", "cherry"}

# Check if "banana" is in the set
print("banana" in fruits)  # Output: True

# Add an element to the set
fruits.add("grape")
print(fruits)  # Output: {"apple", "banana", "cherry", "grape"}

# Add multiple elements to the set
fruits.update(["orange", "kiwi"])
print(fruits)  # Output: {"apple", "banana", "cherry", "grape", "orange", "kiwi"}

# Remove "apple" from the set
fruits.remove("apple")
print(fruits)  # Output: {"banana", "cherry", "grape", "orange", "kiwi"}

# Set's length
print(len(fruits))  # Output: 5

# We can also perform set operations like union, intersection, and difference
other_fruits = {"strawberry", "banana", "mango"}
print(fruits.union(other_fruits))  # Output: {"banana", "cherry", "grape", "orange", "kiwi", "strawberry", "mango"}

print(fruits.intersection(other_fruits))  # Output: {"banana"}

print(fruits.difference(other_fruits))  # Output: {"cherry", "grape", "orange", "kiwi"}

* Non-empty sets are created using curly braces `{}`; an empty set is created with `set()`, because `{}` creates an empty dictionary.
* Unordered and unindexed collection of unique elements.
* Elements cannot be accessed using indexing or slicing.
* Useful for removing duplicates from a list.
* Useful for performing set operations such as union, intersection, difference, etc.
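
A brief sketch of these points, using a made-up `observed` list:

```python
observed = ["moose", "beaver", "moose", "lynx", "beaver"]  # hypothetical data

# Converting a list to a set removes duplicates
unique_species = set(observed)
print(len(unique_species))  # 3

# sorted() returns a list, which gives a predictable order for display
sorted_unique = sorted(unique_species)
print(sorted_unique)  # ['beaver', 'lynx', 'moose']

# {} is an empty dictionary; use set() for an empty set
empty_dict = {}
empty_set = set()
print(type(empty_dict).__name__, type(empty_set).__name__)  # dict set

# Operators are shorthand for union, intersection, and difference
a = {1, 2, 3}
b = {3, 4}
print(sorted(a | b))  # [1, 2, 3, 4]
print(sorted(a & b))  # [3]
print(sorted(a - b))  # [1, 2]
```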

## 💪 Exercise 2

You are given a list of observations with a reference list of species for a site. You have to perform the following tasks:
* Print the number of observations per year.
* Print the number of observations of exotic species.
* Print the number of observations per kingdom.

In [None]:
observations = [('1990-01-01', 'Dreissena polymorpha'),
                ('1996-01-04', 'Fallopia japonica'),
                ('2013-06-07', 'Lonicera japonica'),
                ('1999-02-06', 'Populus tremuloides'),
                ('2008-06-12', 'Phragmites australis'),
                ('2004-01-25', 'Tamiasciurus hudsonicus'),
                ('2015-12-21', 'Populus tremuloides'),
                ('2008-06-08', 'Salmo salar'),
                ('2011-12-15', 'Picea glauca'),
                ('2007-12-27', 'Populus tremuloides'),
                ('1998-07-11', 'Carpinus caroliniana'),
                ('2010-02-25', 'Alces alces'),
                ('2011-12-28', 'Tamiasciurus hudsonicus'),
                ('1998-01-03', 'Ursus americanus'),
                ('2011-12-28', 'Fallopia japonica'),
                ('2003-06-24', 'Puma concolor'),
                ('2001-07-23', 'Lonicera japonica'),
                ('1994-07-22', 'Carpinus caroliniana'),
                ('2007-11-19', 'Castor canadensis'),
                ('1992-09-17', 'Quercus rubra')]
                
species = [
    {"scientific_name": "Dreissena polymorpha", "exotic": True, "vernacular_name": "zebra mussel",
        "kingdom": "Animalia", "class": "Bivalvia", "family": "Dreissenidae"},
    {"scientific_name": "Puma concolor", "exotic": False, "vernacular_name": "cougar",
        "kingdom": "Animalia", "class": "Mammalia", "family": "Felidae"},
    {"scientific_name": "Ursus americanus", "exotic": False, "vernacular_name": "black bear",
        "kingdom": "Animalia", "class": "Mammalia", "family": "Ursidae"},
    {"scientific_name": "Alces alces", "exotic": False, "vernacular_name": "moose",
        "kingdom": "Animalia", "class": "Mammalia", "family": "Cervidae"},
    {"scientific_name": "Lynx canadensis", "exotic": False, "vernacular_name": "Canadian lynx",
        "kingdom": "Animalia", "class": "Mammalia", "family": "Felidae"},
    {"scientific_name": "Branta canadensis", "exotic": False, "vernacular_name": "Canada goose",
        "kingdom": "Animalia", "class": "Aves", "family": "Anatidae"},
    {"scientific_name": "Castor canadensis", "exotic": False, "vernacular_name": "North American beaver",
        "kingdom": "Animalia", "class": "Mammalia", "family": "Castoridae"},
    {"scientific_name": "Tamiasciurus hudsonicus", "exotic": False, "vernacular_name": "red squirrel",
        "kingdom": "Animalia", "class": "Mammalia", "family": "Sciuridae"},
    {"scientific_name": "Acer saccharum", "exotic": False, "vernacular_name": "sugar maple",
        "kingdom": "Plantae", "class": "Magnoliopsida", "family": "Sapindaceae"},
    {"scientific_name": "Picea glauca", "exotic": False, "vernacular_name": "white spruce",
        "kingdom": "Plantae", "class": "Pinopsida", "family": "Pinaceae"},
    {"scientific_name": "Larix laricina", "exotic": False, "vernacular_name": "tamarack",
        "kingdom": "Plantae", "class": "Pinopsida", "family": "Pinaceae"},
    {"scientific_name": "Abies balsamea", "exotic": False, "vernacular_name": "balsam fir",
        "kingdom": "Plantae", "class": "Pinopsida", "family": "Pinaceae"},
    {"scientific_name": "Lonicera japonica", "exotic": True, "vernacular_name": "Japanese honeysuckle",
        "kingdom": "Plantae", "class": "Magnoliopsida", "family": "Caprifoliaceae"},
    {"scientific_name": "Fallopia japonica", "exotic": True, "vernacular_name": "Japanese knotweed",
        "kingdom": "Plantae", "class": "Magnoliopsida", "family": "Polygonaceae"},
    {"scientific_name": "Carpinus caroliniana", "exotic": False, "vernacular_name": "American hornbeam",
        "kingdom": "Plantae", "class": "Magnoliopsida", "family": "Betulaceae"},
    {"scientific_name": "Quercus rubra", "exotic": False, "vernacular_name": "red oak",
        "kingdom": "Plantae", "class": "Magnoliopsida", "family": "Fagaceae"},
    {"scientific_name": "Myriophyllum spicatum", "exotic": True, "vernacular_name": "Eurasian watermilfoil",
        "kingdom": "Plantae", "class": "Magnoliopsida", "family": "Haloragaceae"},
    {"scientific_name": "Populus tremuloides", "exotic": False, "vernacular_name": "trembling aspen",
        "kingdom": "Plantae", "class": "Magnoliopsida", "family": "Salicaceae"},
    {"scientific_name": "Phragmites australis", "exotic": True, "vernacular_name": "common reed",
        "kingdom": "Plantae", "class": "Liliopsida", "family": "Poaceae"},
    {"scientific_name": "Esox lucius", "exotic": False, "vernacular_name": "northern pike",
        "kingdom": "Animalia", "class": "Actinopterygii", "family": "Esocidae"},
    {"scientific_name": "Salmo salar", "exotic": False, "vernacular_name": "Atlantic salmon",
        "kingdom": "Animalia", "class": "Actinopterygii", "family": "Salmonidae"}
]



In [None]:
# 1. Count the number of observations per year (the year is the first four characters of the date)

year_counts = {}
for row in observations:
    year = row[0][0:4]
    if year not in year_counts:
        year_counts[year] = 0
    year_counts[year] += 1

print(year_counts)

# 2. Count the observations of exotic species
# Build a lookup dictionary of species keyed by scientific name
species_by_scientific_name = {}
for sp in species:
    species_by_scientific_name[sp['scientific_name']] = sp

# Iterate over observations to get exotic by scientific name
count_exotic = 0
for _, obs_species in observations:
    is_exotic = species_by_scientific_name[obs_species]['exotic']
    if is_exotic:
        count_exotic += 1

print(f'There are {count_exotic} observations of exotic species')

# 3. Count the observations per kingdom
# Initialize a counter for each kingdom
kingdom_counts = {}
for sp in species:
    kingdom_counts[sp['kingdom']] = 0

# Iterate over observations to get kingdom by scientific name
for _, obs_species in observations:
    kingdom = species_by_scientific_name[obs_species]['kingdom']
    kingdom_counts[kingdom] += 1

print(kingdom_counts)


In [None]:
# Conclusion : There are better ways to do this, but this is a good start
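
One possible "better way", sketched on a three-row subset of the exercise data with only the fields needed here. It uses `collections.Counter` and comprehensions, which are one option among several rather than the intended solution:

```python
from collections import Counter

# Subset of the exercise data, trimmed to the fields used below
observations = [
    ("1990-01-01", "Dreissena polymorpha"),
    ("2011-12-15", "Picea glauca"),
    ("2011-12-28", "Fallopia japonica"),
]
species = [
    {"scientific_name": "Dreissena polymorpha", "exotic": True, "kingdom": "Animalia"},
    {"scientific_name": "Picea glauca", "exotic": False, "kingdom": "Plantae"},
    {"scientific_name": "Fallopia japonica", "exotic": True, "kingdom": "Plantae"},
]

# Lookup dictionary keyed by scientific name
species_by_name = {sp["scientific_name"]: sp for sp in species}

# 1. Observations per year
year_counts = Counter(date[:4] for date, _ in observations)

# 2. Observations of exotic species (True counts as 1 when summed)
count_exotic = sum(species_by_name[name]["exotic"] for _, name in observations)

# 3. Observations per kingdom
kingdom_counts = Counter(species_by_name[name]["kingdom"] for _, name in observations)

print(dict(year_counts))     # {'1990': 1, '2011': 2}
print(count_exotic)          # 2
print(dict(kingdom_counts))  # {'Animalia': 1, 'Plantae': 2}
```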