**comparing lists**  
- GG > [Difference between two lists in Python](https://www.geeksforgeeks.org/python-difference-two-lists/)
- SO > [How can I compare two lists in python and return matches](https://stackoverflow.com/questions/1388818/how-can-i-compare-two-lists-in-python-and-return-matches) (see "this question already has answers here")
- docs.python.org > [Set Types -- set, frozenset](https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset)

In [None]:
# test set
# see Od2Package > Package > check_assets_filenames()
one = [1, 2]
two = [3, 4, 5, 6]
diff = len(one) - len(two)
if diff != 0:
    print("difference in list lengths")
    print(diff)
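
The length check above only shows that the lists differ in size, not which items differ. A minimal sketch using the set operations described in the links above, assuming the items are hashable and that order and duplicates do not matter. The names `listed` and `found` and their values are made up for illustration:

```python
# hypothetical sample data, e.g. filenames listed in metadata vs. found on disk
listed = ["a.tif", "b.tif", "c.tif"]
found = ["b.tif", "c.tif", "d.tif", "e.tif"]

only_listed = set(listed) - set(found)   # listed but not found
only_found = set(found) - set(listed)    # found but not listed
in_both = set(listed) & set(found)       # matches

print(sorted(only_listed))
print(sorted(only_found))
print(sorted(in_both))
```

Sets drop duplicates, so if repeated items matter, `collections.Counter` would be a closer fit.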

**an idea for working through the delimited data:**  

In [5]:
# validate column-by-column
# check coll_config for the column header first;
# if it is not there, use the column info from default_config
    # in either case, the header[0] options are:
        # function - call the function named in header[1]
        # regex - evaluate values against the regex in header[1]
        # string - evaluate values to match exactly the string in header[1]
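
The three options above could be handled by one small dispatcher. This is only a sketch, assuming each config entry is a two-item list `[kind, argument]` as in the notes; `value_is_valid`, `is_nonempty`, and `CHECK_FUNCTIONS` are hypothetical names, not part of the existing code:

```python
import re

def is_nonempty(value):
    """Hypothetical check function that a 'function' rule could name."""
    return bool(value.strip())

# Hypothetical registry: maps names used in the config to real functions.
CHECK_FUNCTIONS = {"is_nonempty": is_nonempty}

def value_is_valid(rule, value):
    """Return True if value satisfies rule, a [kind, argument] pair."""
    kind, argument = rule
    if kind == "function":
        return CHECK_FUNCTIONS[argument](value)
    if kind == "regex":
        # fullmatch: the whole value must match the pattern
        return re.fullmatch(argument, value) is not None
    if kind == "string":
        return value == argument
    raise ValueError(f"unknown rule kind: {kind!r}")

print(value_is_valid(["regex", r"\d{4}"], "2024"))
print(value_is_valid(["string", "image"], "text"))
print(value_is_valid(["function", "is_nonempty"], "  "))
```

Looking functions up in an explicit registry avoids calling arbitrary names that happen to appear in a config file.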

*but how to call methods dynamically, without having to include opening/closing file in **each** method???*

**below response/code are from Google Gemini**:  

> In this approach, the CSV file is opened and read only once during the initialization of the CSVHandler class. The data is stored in the self.data attribute, which is a tuple containing the header and the data rows. Subsequent calls to the check_row method access the data from memory, eliminating the need to reopen the file.

In [None]:
import csv

class CSVHandler:
    def __init__(self, filename):
        self.filename = filename
        self.data = self.load_data()

    def load_data(self):
        with open(self.filename, 'r', newline='') as file:
            reader = csv.reader(file)
            header = next(reader)
            data = list(reader)
        return header, data

    def check_row(self, row_number, column_name, value):
        header, data = self.data
        column_index = header.index(column_name)
        if 0 <= row_number < len(data):
            return data[row_number][column_index] == value
        else:
            return False

# Usage
csv_handler = CSVHandler('my_data.csv')
if csv_handler.check_row(0, 'Name', 'John'):
    print("The value is correct")
else:
    print("The value is incorrect or the row number is out of bounds.")

**how to use a dictionary to dynamically select and call a method based on the column header?**

In [None]:
# this works
class MyClass: # this solution from Google Gemini
    def method_a(self):
        return "Method A called"

    def method_b(self):
        return "Method B called"

    def method_c(self):
        return "Method C called"

    def select_and_call_method(self, key):
        method_mapping = {
            "a": self.method_a,
            "b": self.method_b,
            "c": self.method_c,
        }
        
        method = method_mapping.get(key)
        if method:
            return method() # how does this work???
        else:
            return "Invalid key"



In [None]:
obj = MyClass()

print(obj.select_and_call_method("a"))
print(obj.select_and_call_method("b"))
print(obj.select_and_call_method("c"))
print(obj.select_and_call_method("d"))

Method A called
Method B called
Method C called
Invalid key
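
On the `# how does this work???` comment: `self.method_a` written without parentheses does not call anything. It evaluates to a bound method object, which the dictionary stores as a value. `method_mapping.get(key)` returns that object (or `None` for an unknown key), and adding `()` calls it. A small sketch with a hypothetical `Greeter` class, plus `getattr` as an alternative way to look a method up by name:

```python
class Greeter:  # hypothetical class, for illustration only
    def hello(self):
        return "hello"

g = Greeter()

method = g.hello          # no parentheses: a bound method object, not a result
print(callable(method))
print(method())           # the call happens here; same as g.hello()

# getattr looks up an attribute by its name as a string;
# the third argument is returned when no such attribute exists
by_name = getattr(g, "hello", None)
print(by_name())
missing = getattr(g, "goodbye", None)
print(missing is None)
```

The dictionary version only exposes the methods that are listed in it, while `getattr` can reach any attribute on the object, so the dictionary is the safer choice when keys come from a config file.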


**I'm having some regex issues**  

***I may need to load regexes as raw strings, so I'm asking about this:***  

> Here's how to record a raw Python string in YAML using the PyYAML library:

```Python
import yaml
raw_string = r"This is a raw string with \backslashes and newlines \n"
data = {"raw_value": raw_string}
with open("output.yaml", "w") as file:
    yaml.dump(data, file)
```

> The content of output.yaml will be:

```yaml
raw_value: This is a raw string with \backslashes and newlines \n
```

> PyYAML writes the backslashes as they are, and loading the file back returns the same characters. The loaded value is an ordinary `str`; "raw" only describes how the original literal was written in Python source.

```Python
import yaml

with open("output.yaml", "r") as file:
    loaded_data = yaml.safe_load(file)

print(loaded_data["raw_value"])
# Expected output
# This is a raw string with \backslashes and newlines \n
```

***and some more...***  

> To load a raw string from a variable in Python, assign the raw string literal to the variable using the `r` prefix before the string. This ensures that backslashes and other special characters are treated literally.

```Python
raw_string_content = r"This is a raw string with \backslashes and \n newlines."
print(raw_string_content)
# Output: This is a raw string with \backslashes and \n newlines.

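variable_with_path = "C:\\Users\\example\\Documents"
# The r prefix on "{}" changes nothing here: the variable already holds
# single backslashes, and format() copies them through unchanged.
raw_string_from_variable = r"{}".format(variable_with_path)
print(raw_string_from_variable)
# Output: C:\Users\example\Documents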
```

> Remember that a raw string literal cannot end with an odd number of backslashes. If you need a trailing backslash, either write two backslashes in the raw string (the value then ends with two literal backslashes) or use a regular string and escape the backslash.

```Python
# invalid_raw_string = r"This is invalid\"  # SyntaxError: the backslash keeps the quote from closing the string

valid_raw_string_with_backslash = r"This is valid\\"
print(valid_raw_string_with_backslash)
# Output: This is valid\\

valid_string_with_escaped_backslash = "This is also valid\\"
print(valid_string_with_escaped_backslash)
# Output: This is also valid\
```
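
For the regex concern specifically: the `r` prefix only changes how Python reads a literal in source code, and the result is an ordinary `str`. A pattern read from a file never passes through Python's literal parser, so its backslashes arrive as written and there is nothing to convert to "raw". A stdlib-only sketch, using `io.StringIO` as a stand-in for a file and a made-up pattern:

```python
import io
import re

# The same three characters, spelled two ways in source code.
print(r"\d+" == "\\d+")
print(len(r"\d+"))

# Text read from a file keeps its backslashes untouched.
fake_file = io.StringIO(r"^\d{4}-\d{2}$")   # stand-in for file contents
pattern_text = fake_file.read()
print(re.fullmatch(pattern_text, "2024-05") is not None)
print(re.fullmatch(pattern_text, "2024-5") is not None)
```

In YAML itself, only double-quoted scalars process backslash escapes. Plain and single-quoted scalars keep backslashes literally, so regexes are safest written unquoted or in single quotes.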

In [None]:
# one potential issue - getting raw strings for a regex from yaml
import yaml

with open("config.yaml", "r") as yamlfile:
    config = yaml.safe_load(yamlfile)

In [None]:
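# another potential issue - using re correctly to check against regexes...
import csv, re

# read the rows once; a csv reader is exhausted after a single pass,
# so looping over it inside the header loop only works for the first header
with open("/home/nebgreb/desktop/test.csv", "r", newline="") as csvfile:
    rows = list(csv.DictReader(csvfile))

for header in config:
    if config[header][0] == 'regex':
        pattern = re.compile(config[header][1])
        for row in rows:
            # fullmatch requires the whole value to match;
            # re.match would only anchor at the start
            if not pattern.fullmatch(row[header]):
                print(f"invalid value in {header}: {row[header]}")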