This project is a Python-based tool that dynamically generates Python classes, instances, and dependency graphs from a given XML file. The tool analyzes the structure of the XML document and creates Python code that can:
- Define classes based on the XML structure.
- Generate instances for each XML element.
- Export instances to CSV files.
- Visualize class dependencies as a PDF graph using Graphviz.
- Automatic Class Generation: Generates Python classes with attributes and relationships derived from XML tags and attributes.
- Instance Creation: Creates instances for each XML element.
- CSV Export: Automatically saves instances to CSV files.
- Dependency Graph: Generates a PDF graph of class dependencies using Graphviz.
This project requires the following Python libraries:
os
xml.etree.ElementTree
uuid
collections
keyword
pandas
graphviz
Install the required libraries using:
pip install pandas graphviz
The graphviz
library must be installed on your system for generating dependency graphs. Follow the installation guide for your platform:
- Ubuntu/Debian:
sudo apt-get install graphviz
- MacOS:
brew install graphviz
- Windows: Download and install from Graphviz Download Page.
- Place your XML file (e.g.,
drugbank_partial.xml
) in the project directory. - Run the script using:
python your_script_name.py
Call the generate_python_code
function with the path to your XML file:
from your_script_name import generate_python_code
generate_python_code("path_to_your_xml.xml")
The script will generate the following in the generated_code
directory:
- Class Files: Individual Python files for each class.
- Main Script: A
generated_main.py
file to initialize instances and export them to CSV. - Dependency Graph: A
class_dependencies.pdf
file visualizing class relationships.
Escapes problematic characters (e.g., \
, '
, \n
) in strings.
Sanitizes XML tag/attribute names by:
- Removing namespaces.
- Replacing invalid characters with underscores.
- Adding underscores if the name conflicts with Python keywords.
Analyzes the XML structure to:
- Count child elements and attributes.
- Identify potential fields for class definitions.
Generates Python class code for a given class name based on the XML structure.
Creates an instance of a class and recursively handles its child elements.
Parses an XML element and generates corresponding Python code dynamically.
generate_python_code(xml_file: str, output_dir: str = "generated_code", generate_graph: bool = True)
Main function to generate Python code and optional dependency graphs.
Generates a dependency graph of classes and their relationships, saved as a PDF file.
project_directory/
|-- your_script_name.py # Main script
|-- drugbank_partial.xml # Example XML file
|-- generated_code/ # Output directory
|-- base_model.py # Base model class
|-- generated_main.py # Main script to initialize instances
|-- *.py # Generated class files
|-- class_dependencies.pdf # Dependency graph
You can customize the behavior of the script by modifying:
- BaseModel Class: Extend the base model functionality in
base_model.py
. - Dependency Graph: Adjust graph attributes in the
generate_dependency_graph
function.
If an error occurs, the script prints a detailed error message. Ensure the input XML file is well-formed and adheres to XML standards.
This project is licensed under the MIT License. Feel free to use and modify it as needed.
Happy coding!