This repository contains a basic implementation of a Lexical Analyzer (or Tokenizer) written entirely in Python. The project serves as a practical showcase of the first, critical phase of a compiler.
The tool performs lexical analysis: the process of reading source code and breaking it down into a sequence of elemental units called tokens.
The analyzer identifies and categorizes components such as:
- Keywords: Reserved words in the language (e.g., `if`, `for`).
- Identifiers: Names given to variables, functions, etc.
- Operators: Mathematical and logical symbols (e.g., `+`, `==`).
- Literals: Fixed values such as numbers and strings.
- Separators/Punctuation: Symbols used for grouping and termination (e.g., `(`, `;`).
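To illustrate the idea, here is a minimal sketch of a regex-based tokenizer covering the categories above. The token names, keyword set, and patterns are illustrative assumptions, not taken verbatim from `code.py` or `index.py`:

```python
import re

# Illustrative keyword set and token patterns (assumed, not from code.py).
KEYWORDS = {"if", "else", "for", "while", "return"}

TOKEN_SPEC = [
    ("NUMBER",     r"\d+(\.\d+)?"),           # integer or float literal
    ("STRING",     r'"[^"\n]*"'),             # double-quoted string literal
    ("IDENTIFIER", r"[A-Za-z_]\w*"),          # names; reclassified as KEYWORD below
    ("OPERATOR",   r"==|!=|<=|>=|[+\-*/=<>]"),
    ("SEPARATOR",  r"[(){}\[\];,:]"),
    ("SKIP",       r"[ \t\n]+"),              # whitespace, discarded
    ("MISMATCH",   r"."),                     # anything else is a lexical error
]

MASTER_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Scan `source` and return a list of (kind, value) token pairs."""
    tokens = []
    for match in MASTER_RE.finditer(source):
        kind, value = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"Unexpected character: {value!r}")
        if kind == "IDENTIFIER" and value in KEYWORDS:
            kind = "KEYWORD"   # reserved words are matched as identifiers first
        tokens.append((kind, value))
    return tokens
```

For example, `tokenize("if x == 42")` yields a keyword, an identifier, an operator, and a number literal, in source order.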
| Item | Details |
|---|---|
| Language | Python (100%) |
| Key Files | code.py, index.py |
| License | Apache License 2.0 |
The code is open source and released under the Apache-2.0 license, making it freely available for use, modification, and academic study.