# Week 5 - Files and Regex
## Regex

Given a piece of text, detect the following structures:
- all strings in which digits are surrounded by alphabetical characters (e.g. h3llo)
- all strings that consist only of vowels
- all strings that consist only of consonants
- all strings that consist of the letters of LOREM --> moerl, lorem, remol, lerom, ...
- all strings that consist of the letters of a given word (input)

import re
regex = r"..."
matches = re.findall(regex, text)
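
As a rough sketch of the first three tasks, the patterns below are one possible approach, not the only one. The `sample` string is made up for illustration, and treating `y` as a consonant is an assumption.

```python
import re

# Hypothetical sample text, chosen to exercise each pattern
sample = "h3llo aeiou rhythm w0rld 123 banana brrr a1 1a"

# one or more digits with at least one letter on each side
digits_inside = re.findall(r"\b[A-Za-z]+\d+[A-Za-z]+\b", sample)

# whole words made only of vowels
only_vowels = re.findall(r"\b[aeiouAEIOU]+\b", sample)

# whole words made only of consonants (the alphabet minus a, e, i, o, u; y counts as a consonant here)
only_consonants = re.findall(r"\b[b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z]+\b", sample)

print(digits_inside)
print(only_vowels)
print(only_consonants)
```

The `\b` word boundaries make sure a pattern covers a whole word, so `a1` and `1a` are not reported as vowel-only words.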


In [8]:
import re

text = """Lrm psm d'lr sit amet, consectetur adipiscing elit. Integer et odio consequat, eleifend eros vulputate, 
pellent+esque ex. Lorem ipsum dolor sit mt, consectetur adipiscing elit. Praesent hendrerit ex eget neque 
varius commodo. Mauris meow aliquam, mauris pr'!t ullamcorper weom dapibus, velit metus hendrerit lacus, at viverra massa 
arcu eu metus. Pellentesque tincidunt mi enim, at vestibulum nulla eleifend vitae. Mauris nec hendrerit nunc. 
Curabitur massa libero, omew iaculis id nisi sit amet, ornare maximus diam. Vestibulum maximus lacus non erat pretium, 
sed feugiat dui laoreet. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Class aptent taciti sociosqu ad 
litora torquent per conubia nostra, per inceptos himenaeos. Fusce justo magna, faucibus vitae turpis et, egestas 
pharetra est. Ut bibendum interdum malesuada. Aliquam tempor justo ac nisl lobortis tempus."""

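letters = input()
# character class of the given letters (lower and upper case), repeated exactly len(letters) times
regex = rf"\b[{letters}{letters.upper()}]{{{len(letters)}}}\b"
matches = re.findall(regex, text)  # "matches" instead of "list", which would shadow the built-in
print(matches)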

['meow', 'weom', 'omew']
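
The character-class pattern above accepts any word of the right length built from the given letters, so a word with repeated letters such as `mmmm` would also match. A minimal sketch of a stricter check, assuming each letter has to be used exactly as often as in the input, could filter the regex hits with `collections.Counter`. The function name `find_anagrams` is made up for this example.

```python
import re
from collections import Counter

def find_anagrams(letters, text):
    """Return the words in text that use exactly the letters of `letters`."""
    target = Counter(letters.lower())
    # re.escape keeps characters such as "-" or "]" from breaking the character class
    pattern = rf"\b[{re.escape(letters)}]{{{len(letters)}}}\b"
    candidates = re.findall(pattern, text, flags=re.IGNORECASE)
    # keep only candidates whose letter counts equal those of the input
    return [word for word in candidates if Counter(word.lower()) == target]

found = find_anagrams("meow", "meow weom mmmm omew meee Meow")
print(found)
```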


### Email validator
Write a Python function called validate_email that takes an email address as input and returns True if it is valid and False otherwise:
- the email has to contain "@"
- the "@" splits it into exactly 2 parts
- the part before the "@" --> at least 1 char and only: letters, numbers, dots (.), hyphens (-) and underscores (_)
- the part after the "@" --> at least 2 chars and only: letters, dots (.), hyphens (-)
- the email cannot end in a dot (.)


In [None]:
import re

def validate_email(email):
    if "@" not in email:
        return False
    
    parts = email.split("@")
    if len(parts) != 2:
        return False
    
    # ^ and $ anchor the pattern, so the whole part must consist of allowed characters
    # \w covers letters, digits and the underscore
    # inside [...] a dot is already literal; the hyphen is escaped so it is not read as a range
    regex = r"^[\w\.\-_]+$"
    if re.match(regex, parts[0]) is None:
        return False
    
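    # note: A-z also spans [ \ ] ^ _ ` and there is no $ anchor,
    # so this check is looser than the rules listed above
    regex = r"[A-z\.\-]{2,}"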
    if re.match(regex, parts[1]) is None:
        return False
    
    if email[-1] == ".":
        return False
    
    return True

print(validate_email("anthony.coppens@thomasmore-be"))
print(validate_email("anthony.coppens@thomasmore@be"))
print(validate_email("anthony.coppens@thomasmore.be."))
print(validate_email("anthony coppens@thomasmore.be"))
print(validate_email("anthony.coppens@thomasmore.be"))
print(validate_email("antho234_-6ny.coppens@thomasmore.be"))



True
False
False
False
True
True
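
The second pattern in validate_email is tested with `re.match` and has no `$`, so only the start of the domain is checked, and the range `A-z` also lets through `[`, `\`, `]`, `^`, `_` and the backtick. A stricter variant, sketched under the assumption that the rules listed above are the full specification, could use `fullmatch` with explicit ranges. The name `validate_email_strict` is made up for this example.

```python
import re

# explicit character ranges; a hyphen at the end of a class is literal
LOCAL_PART = re.compile(r"[A-Za-z0-9._-]+")
DOMAIN_PART = re.compile(r"[A-Za-z.-]{2,}")

def validate_email_strict(email):
    parts = email.split("@")
    if len(parts) != 2:  # also covers the case without any "@"
        return False
    local, domain = parts
    # fullmatch requires the entire part to consist of allowed characters
    if LOCAL_PART.fullmatch(local) is None:
        return False
    if DOMAIN_PART.fullmatch(domain) is None:
        return False
    return not email.endswith(".")

print(validate_email_strict("anthony.coppens@thomasmore.be"))
print(validate_email_strict("example@sub_.domain.com"))
```

This variant still accepts addresses such as `example@sub..com`, because the listed rules say nothing about consecutive dots.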


## Files and folders: os and pathlib

In [None]:
import os, pathlib

print(os.getcwd()) #current working directory
print(type(os.getcwd()))
print(pathlib.Path.cwd())
print(type(pathlib.Path.cwd()))
print()

calc = pathlib.Path(r"C:\Windows\System32\calc.exe")
print("Filename: ", calc.name)
print("Name: ", calc.stem)
print("Extension: ", calc.suffix)
print("Parent dir: ", calc.parent)
print(calc.is_file())
print(calc.exists())

print(pathlib.Path.home())

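#building paths with strings: the separator depends on the platform
print(str(pathlib.Path.home()) + "\\" + "test") #WINDOWS
print(str(pathlib.Path.home()) + "/" + "test") #MACOS and LINUX

pathToEmail = os.getcwd() + r"\Files\e-mail.txt" #works on Windows only
print(pathToEmail)

#the / operator of pathlib picks the right separator for the current platform
pathToEmail = pathlib.Path.cwd()/"Files/e-mail.txt"
print(pathToEmail)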

c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb
<class 'str'>
c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb
<class 'pathlib.WindowsPath'>

Filename:  calc.exe
Name:  calc
Extension:  .exe
Parent dir:  C:\Windows\System32
True
True
C:\Users\antho
C:\Users\antho\test
C:\Users\antho/test
c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files\e-mail.txt
c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files\e-mail.txt
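
To see how the `/` operator handles separators without depending on the machine the code runs on, the sketch below uses the pure path classes of pathlib, which only manipulate path strings and never touch the file system. The POSIX home directory `/home/antho` is a made-up example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# the same joins, rendered with the separator of each path flavour
win = PureWindowsPath(r"C:\Users\antho") / "Files" / "e-mail.txt"
posix = PurePosixPath("/home/antho") / "Files" / "e-mail.txt"  # hypothetical home directory

print(win)
print(posix)

# the parts of a path are available as attributes
print(win.name, win.stem, win.suffix, win.parent.name)
```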


### Directory walkthrough

In [45]:
import os
from pathlib import Path

path = r"c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb"
#path = Path.cwd() #alternative: walk the current working directory (os.chdir() returns None, so it cannot be used here)
for folderName, subfolders, fileNames in os.walk(path):
    print(f"The current folder is: {folderName}")

    for subfolder in subfolders:
        print(f"Subfolder of {folderName}: {subfolder}")

    for file in fileNames:
        print(f"File inside of {folderName}: {file}")
    
    print("---")

The current folder is: c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb
Subfolder of c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb: Files
Subfolder of c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb: RandomStuff
File inside of c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb: Week5FilesandRegex.ipynb
---
The current folder is: c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files
File inside of c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files: countries.csv
File inside of c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files: dictionary.txt
File inside of c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files: e-

In [47]:
#looking for a file in a path
from pathlib import Path
path = r"c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files"
path_object = Path(path)

for file in path_object.glob("*.txt"):
    print(file)

c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files\dictionary.txt
c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files\e-mail.txt
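
`glob("*.txt")` only looks at the folder itself. For a search that also descends into subfolders, `rglob` is the recursive counterpart. The sketch below builds a small throwaway directory tree (the file names are only examples) so it can run anywhere.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # throwaway tree: one .txt at the top, one .txt and one .csv in a subfolder
    (root / "Files").mkdir()
    (root / "Files" / "dictionary.txt").write_text("a\n")
    (root / "Files" / "countries.csv").write_text("b\n")
    (root / "notes.txt").write_text("c\n")

    # glob: top level only
    top_level = sorted(p.name for p in root.glob("*.txt"))
    # rglob: top level and all subfolders
    recursive = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.txt"))

print(top_level)
print(recursive)
```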


## Writing files

In [51]:
import os
print(os.getcwd())

#creating and writing
file = open("test.txt", "w") #"w" creates the file, or overwrites it if it already exists
file.write("test") #write() does not add a newline
file.write("I have no idea")
file.write("\n")
file.write("HELP")
file.close()

file = open("test.txt", "a") #"a" appends to the end of the existing file
file.write("CHECK")
file.close()


c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb
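
Because `write()` never adds a line break on its own, each line needs an explicit `"\n"`. A minimal sketch of the same write-then-append sequence with `with` blocks, using a temporary folder so that no real file is touched:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "test.txt"

    # "w" creates or overwrites; the with block closes the file automatically
    with open(target, "w", encoding="utf-8") as f:
        f.write("first line\n")
        f.write("second line\n")

    # "a" keeps the existing content and adds to the end
    with open(target, "a", encoding="utf-8") as f:
        f.write("appended line\n")

    content = target.read_text(encoding="utf-8")

print(content)
```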


## Reading from a file

In [None]:
#way 1
file = open("test.txt", "r")
print(file.read())
file.close()

#way 2: with closes the file automatically, even when an error occurs
print()
with open("test.txt", "r") as file:
    print(file.read())


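#encoding problems? trouble reading a file? pass the encoding explicitly
file = open("test.txt", "r", encoding="utf-8")
file.close()

#7-bit: US-ASCII
#8-bit: Latin-1 (ISO-8859-1)
#UTF-8: 1 to 4 bytes per character, UTF-16: 2 or 4 bytes per character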



testI have no idea
HELPCHECK

testI have no idea
HELPCHECK
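
An encoding problem usually shows up as a `UnicodeDecodeError`. The sketch below provokes one on purpose: it stores a made-up word as Latin-1 bytes, fails to read it as UTF-8, and then reads it with the matching encoding.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "latin1.txt"
    # "é" is the single byte 0xE9 in Latin-1, which is not valid UTF-8 on its own
    path.write_bytes("café\n".encode("latin-1"))

    try:
        path.read_text(encoding="utf-8")
        decoded_as_utf8 = True
    except UnicodeDecodeError:
        decoded_as_utf8 = False

    # reading with the encoding the file was written in works
    text = path.read_text(encoding="latin-1")

print(decoded_as_utf8)
print(text)
```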


### Reading emails from a file
Check the emails with the validate_email() function that we created earlier. Each line of e-mail.txt holds an address and the expected result, separated by a semicolon.

In [56]:
import os
os.chdir(r"c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb\Files")
with open("e-mail.txt", "r") as file:
    for line in file:
        #strip the trailing newline so that each result prints on a single line
        parts = line.strip().split(";")
        print(parts[0], parts[1], "-->", validate_email(parts[0]))

example@example.com True --> True
example@sub.domain.com True --> True
example-123@sub.domain.com True --> True
example123@sub.domain.com True --> True
example_123@sub.domain.com True --> True
example@sub-domain.com True --> True
example@sub.domain False --> True
example@sub. False --> False
example@sub..com False --> True
example@sub.domain.c False --> True
example@sub-.domain.com False --> True
example@sub_.domain.com False --> True
example@sub.domain-.com False --> True
example@sub.domain_.com False --> True
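
Several rows show an expected value that differs from what the validator returns. A small sketch for listing only those rows could use the `csv` module with `;` as delimiter. The helper `compare_results`, the sample rows and the stand-in validator `contains_at` are all made up for this example; in the notebook, validate_email would be passed in instead.

```python
import csv
import io

def compare_results(lines, validator):
    """Return (email, expected, actual) for every row where the validator disagrees."""
    mismatches = []
    for row in csv.reader(lines, delimiter=";"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        email = row[0]
        expected = row[1].strip() == "True"
        actual = validator(email)
        if actual != expected:
            mismatches.append((email, expected, actual))
    return mismatches

def contains_at(email):
    # stand-in validator, deliberately simplistic
    return "@" in email

# stand-in for the opened file: any iterable of lines works
sample = io.StringIO("example@example.com;True\nexample@sub.;False\nno-at-sign;False\n")
mismatches = compare_results(sample, contains_at)
print(mismatches)
```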


## Organizing files
Creating directories, deleting files, copying, moving, ...
Use Send2Trash for safe deletes: it moves files to the recycle bin instead of removing them permanently.

In [None]:
! pip install Send2Trash

In [63]:
import os, shutil #shutil copies, moves and deletes files and folders

os.chdir(r"c:\Users\antho\OneDrive\Documenten\Anthony-TM\Scripting\Scripting-Students-24-sem2\Week5\DSPSb")
os.makedirs("folder", exist_ok=True) #creates the folder; no error if it already exists
shutil.copy("test.txt", "folder") #copies the file into the folder
shutil.move("folder/test.txt", "folder/bacon.txt") #moving within the same folder renames the file
#shutil.rmtree("folder") #permanently deletes the folder and everything in it


'folder/bacon.txt'
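
The same copy, rename and delete sequence can be tried safely in a temporary directory. This is only a sketch: the file and folder names mirror the cell above, and everything is thrown away when the `with` block ends.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "test.txt"
    source.write_text("test")

    folder = root / "folder"
    folder.mkdir(exist_ok=True)

    # copy returns the path of the new file inside the folder
    copied = Path(shutil.copy(source, folder))
    # move with a new name in the same folder acts as a rename
    renamed = Path(shutil.move(str(copied), str(folder / "bacon.txt")))

    names = sorted(p.name for p in folder.iterdir())
    source_still_exists = source.exists()

    # rmtree deletes the folder and its contents permanently (no recycle bin)
    shutil.rmtree(folder)
    folder_exists = folder.exists()

print(names, source_still_exists, folder_exists)
```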

In [None]:
import send2trash
file = open("testfile.txt", "w")
file.close()

send2trash.send2trash("testfile.txt") #moves the file to the recycle bin instead of deleting it permanently