This is an individual assignment for the Formative One - Regex Onboarding Hackathon. As a junior full-stack developer, I've created a web application that aggregates data from various sources. This project uses regular expressions to extract specific data types from hundreds of pages of text. The repository contains a Python script that validates different data formats, including their edge cases.
The assignment required the implementation of at least four of the following data extractions. This project includes a command-line interface for validating these data formats:
- Email Addresses: Validates email formats like user@example.com and firstname.lastname@company.co.uk.
- URLs: Validates URLs such as https://www.example.com and https://subdomain.example.org/page.
- Phone Numbers: Validates various phone number formats, including (123) 456-7890, 123-456-7890, and 123.456.7890.
- Credit Card Numbers: Validates credit card formats like 1234 5678 90123456 and 1234-5678-9012-3456.
- Time: Validates time in both 24-hour (14:30) and 12-hour (2:30 PM) formats.
- HTML Tags: Validates HTML tags like <p>, <div class="example">, and <img src="image.jpg" alt="description">.
- Hashtags: Validates hashtags such as #example and #ThisIsAHashtag.
- Currency Amounts: Validates currency amounts like $19.99 and $1,234.56.
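The difference between validating a single string and extracting values from a longer text can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes unanchored variants (no `^` or `$`) of the email and hashtag patterns listed later in this README, and the names `EMAIL_RE`, `HASHTAG_RE`, and `extract` are hypothetical.

```python
import re

# Unanchored variants of the README's email and hashtag patterns (no ^ or $),
# so they can match anywhere inside a longer text. All names are illustrative.
EMAIL_RE = re.compile(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}')
HASHTAG_RE = re.compile(r'#[a-zA-Z0-9_]+')


def extract(text: str) -> dict:
    """Return every email address and hashtag found in the text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "hashtags": HASHTAG_RE.findall(text),
    }


sample = (
    "Write to user@example.com or firstname.lastname@company.co.uk "
    "and tag it #example or #ThisIsAHashtag."
)
found = extract(sample)
print(found["emails"])    # ['user@example.com', 'firstname.lastname@company.co.uk']
print(found["hashtags"])  # ['#example', '#ThisIsAHashtag']
```

The anchored patterns answer "is this whole string an email?", while `findall` with unanchored patterns answers "which emails appear in this text?".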
- Python 3.x: This project is implemented in Python.
- re module: The built-in Python re module is used for regular expression operations.
- GitHub Repository: The project is hosted on GitHub under the name alu_regex-data-extraction-Git-with-gideon, with the account created using an ALU email address.
- Clone this repository:
git clone https://github.com/Git-with-gideon/alu_regex-data-extraction-Git-with-gideon.git
- Navigate to the project directory:
cd alu_regex-data-extraction-Git-with-gideon
- Run the script from your terminal:
python main.py or python3 main.py
- Follow the on-screen prompts to select a validation type and enter a string.
- The script will tell you whether the input is valid or not.
The script also includes a predefined set of inputs that the validation functions run against.
- Code Quality: The code is clean, readable, and well-documented to explain the logic behind the regex patterns and functions.
- README: This file provides a detailed overview of the project and setup instructions.
- Test Cases: The repository includes sample inputs and their corresponding outputs to demonstrate the functionality of the regex solutions.
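A table of sample inputs and expected results might look like the sketch below. The repository's actual test inputs are not reproduced here; the cases are illustrative, the patterns are the phone, credit card, and currency patterns listed in this README, and the names `CASES` and `run_cases` are hypothetical.

```python
import re

# Patterns copied from this README; the case table itself is an assumption.
PHONE = r'^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$'
CARD = r'^\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}$'
CURRENCY = r'^\$?\d{1,3}(?:,?\d{3})*(?:\.\d{2})?$'

# (pattern, input, expected result)
CASES = [
    (PHONE, "(123) 456-7890", True),
    (PHONE, "123.456.7890", True),
    (PHONE, "123-45-67890", False),    # middle group has only two digits
    (CARD, "1234-5678-9012-3456", True),
    (CARD, "1234-5678-9012", False),   # only twelve digits
    (CURRENCY, "$1,234.56", True),
    (CURRENCY, "$19.9", False),        # cents need exactly two digits
]


def run_cases(cases):
    """Return the cases whose actual result differs from the expected one."""
    failures = []
    for pattern, text, expected in cases:
        actual = re.match(pattern, text) is not None
        if actual != expected:
            failures.append((pattern, text, expected, actual))
    return failures


print(run_cases(CASES))  # an empty list means every case behaved as expected
```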
-------------------------------------
Welcome to the Regex Data Validator!
Select a data type to validate:
1. Email Addresses
2. URLs
3. Phone Numbers
4. Hashtags
5. Currency Amounts
6. Exit
-------------------------------------
Enter your choice (1-6):
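One way to wire the menu above to the validators is a dictionary keyed by menu number. This is a sketch under assumptions, not the contents of main.py: the menu numbers follow the sample output, the patterns are the ones listed in this README, and the names `VALIDATORS`, `run_choice`, and `main`, as well as the message wording, are hypothetical.

```python
import re

# Menu number -> (label, compiled pattern). Labels follow the sample menu;
# patterns are copied from this README. Option 2 (URLs) is left out here
# for brevity, so this sketch would reject it as an invalid choice.
VALIDATORS = {
    "1": ("Email Addresses",
          re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')),
    "3": ("Phone Numbers",
          re.compile(r'^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$')),
    "4": ("Hashtags", re.compile(r'^#([a-zA-Z0-9_]+)$')),
    "5": ("Currency Amounts",
          re.compile(r'^\$?\d{1,3}(?:,?\d{3})*(?:\.\d{2})?$')),
}


def run_choice(choice: str, text: str) -> str:
    """Validate text against the pattern selected by the menu choice."""
    if choice not in VALIDATORS:
        return "Invalid choice."
    label, pattern = VALIDATORS[choice]
    verdict = "valid" if pattern.match(text.strip()) else "invalid"
    return f"{label}: {verdict}"


def main() -> None:
    """Prompt loop; call main() to start it. Option 6 exits."""
    while True:
        choice = input("Enter your choice (1-6): ").strip()
        if choice == "6":
            break
        print(run_choice(choice, input("Enter a string to validate: ")))
```

Keeping the validation in `run_choice` separates it from the `input()` calls, so the logic can be checked without a terminal session.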
The script uses the following regular expression patterns for validation:
- Email:
r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
- URL:
r'^(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/[a-zA-Z0-9]+\.[^\s]{2,}|[a-zA-Z0-9]+\.[^\s]{2,})$'
- Phone Number:
r'^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$'
- Credit Card:
r'^\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}$'
- Time:
r'^(?:[01]\d|2[0-3]):[0-5]\d(?::[0-5]\d)?$|^(?:1[0-2]|0?[1-9]):[0-5]\d(?::[0-5]\d)?\s?(?:am|pm)$'
- HTML Tag:
r'^<([a-z]+)([^>]*)>(.*?)<\/\1>$|^<([a-z]+)([^>]*)\/>$'
- Hashtag:
r'^#([a-zA-Z0-9_]+)$'
- Currency:
r'^\$?\d{1,3}(?:,?\d{3})*(?:\.\d{2})?$'
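The patterns above can be applied with the re module roughly as shown below. This is a minimal sketch, not the script itself; the names `PATTERNS` and `is_valid` are hypothetical. One assumption is worth calling out: the time pattern as written spells am/pm in lowercase, so matching an input like 2:30 PM requires the `re.IGNORECASE` flag (or lowercasing the input first). Whether the script does this is not stated in this README.

```python
import re

# Patterns copied from the list above; the dictionary and its keys are illustrative.
PATTERNS = {
    "email": re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'),
    "phone": re.compile(r'^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$'),
    "hashtag": re.compile(r'^#([a-zA-Z0-9_]+)$'),
    # Assumption: IGNORECASE is added here so that "PM" matches (?:am|pm).
    "time": re.compile(
        r'^(?:[01]\d|2[0-3]):[0-5]\d(?::[0-5]\d)?$'
        r'|^(?:1[0-2]|0?[1-9]):[0-5]\d(?::[0-5]\d)?\s?(?:am|pm)$',
        re.IGNORECASE,
    ),
}


def is_valid(kind: str, text: str) -> bool:
    """Return True if the whole string matches the pattern for this data type."""
    return PATTERNS[kind].match(text.strip()) is not None


print(is_valid("email", "firstname.lastname@company.co.uk"))  # True
print(is_valid("email", "user@@example.com"))                 # False
print(is_valid("phone", "123.456.7890"))                      # True
print(is_valid("time", "2:30 PM"))                            # True
print(is_valid("time", "25:00"))                              # False
```

Note that the HTML tag pattern as listed expects either a matching closing tag or a self-closing `/>`, so a lone opening tag such as <p> would not match it as written.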
- Erioluwa Gideon Olowoyo: Full-stack developer in training.