DocsTranslator is a Java application that attempts to translate documented Java source code (a `*-sources.jar` file generated when building a project) into documented Python code that can be imported and utilized in Python scripts, with documentation of available classes/methods visible in the Python IDE.
This project generates Python source files that are intended to be used in conjunction with PySpigot, a Python scripting engine that works within Minecraft. PySpigot utilizes Jython, which is a Python implementation that runs on the JVM. More specifically, this project allows for autocomplete, code suggestions, and documentation usage when writing Python scripts that utilize Java classes.
PySpigot scripts are able to access Java classes at runtime, but one issue is the lack of autocomplete and code suggestions when writing scripts, as none of these Java classes are available when writing Python.
Therefore, the objective of this project is to translate those Java classes (as well as their accompanying documentation) into readable Python source code, so that autocomplete, code suggestions, and documentation are available when writing PySpigot/Jython scripts.
DocsTranslator relies heavily upon the JavaParser library, which, in the simplest terms, reads Java source (`.java`) files and turns them into an abstract syntax tree, which can be read programmatically and translated into Python source files with relative ease.
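To make the goal concrete, here is an illustrative sketch of what a translated module might look like. The Java class `com.example.Greeter`, its members, and the docstring layout are all hypothetical; the real layout depends on the format options in the `settings.yml`.

```python
"""
Illustrative stub for a hypothetical Java class com.example.Greeter.
"""


class Greeter:
    """
    Builds greeting messages.
    """

    def __init__(self, prefix: str):
        """
        Arguments
        - prefix: text placed before every name
        """
        ...

    def greet(self, name: str) -> str:
        """
        Returns a greeting for the given name.

        Arguments
        - name: the name to greet

        Returns
        - the full greeting
        """
        ...
```

The method bodies are empty on purpose: a stub like this exists only so the IDE can show signatures and documentation, while the real Java class is resolved by Jython at runtime.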
The application runs in a stepwise fashion:
- The application initializes the Maven and JDK sources directories.
- Each translate job defined in the `translateJobs` section of the `settings.yml` is initialized.
- Initialized jobs are submitted to a multithreaded executor service and are run in parallel. For each job:
  - Java source JAR files (i.e., those that follow the format `*-sources.jar`) are fetched from remote Maven repositories (for artifacts defined in the `settings.yml`) and installed into a local Maven repository using Apache Maven Resolver.
  - Apache Maven Resolver resolves and fetches dependencies for all artifacts fetched in the previous step, using the scope specified in the `settings.yml`. Runtime dependencies are fetched by default, as these will be accessible at runtime.
  - The application loops through the contents of all fetched JAR files. When it encounters a Java source file (ending in `.java`), the file is parsed with JavaParser, and a best-effort attempt is made to translate the source file into Python code.
    - Any files not ending in `.java` are ignored.
  - Translated `.py` files are placed in the user-defined output folder (`generated` by default), in the appropriate package.
  - An entry is added to the `__init__.py` file in the appropriate package, to allow for importing the Python module as one would normally import a Java class in Jython.
  - Any source files from the Java Standard Library utilized by the previously translated Java source files are also translated in the same process outlined above.
    - JDK sources must be downloaded manually and placed in the appropriate folder (`jdk-sources` by default).
    - This step is only completed if enabled in the `settings.yml` (via the `jdkSources.translate` option).
  - All `__init__.py` files are generated and placed in their appropriate locations.
  - Python package-related files (`setup.py`, `pyproject.toml`, `MANIFEST.in`, `LICENSE`) are generated from options specified in the `settings.yml` and are placed in the user-defined output folder (`generated` by default).
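The JAR-scanning step above can be sketched in a few lines. This is a Python approximation for illustration only (DocsTranslator itself is written in Java), and the function names `plan_translations` and `init_entries` are hypothetical. It shows how `.java` entries map onto `.py` paths under the output folder and how class names could be grouped per package for the `__init__.py` files.

```python
import io
import zipfile
from pathlib import PurePosixPath


def plan_translations(jar_bytes: bytes, output_root: str = "generated") -> dict[str, str]:
    """Map each .java entry in a sources JAR to the .py path it would be written to."""
    plan = {}
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for entry in jar.namelist():
            # Entries that do not end in .java (manifests, resources) are ignored.
            if not entry.endswith(".java"):
                continue
            source = PurePosixPath(entry)
            # Keep the package folders and swap the file extension.
            plan[entry] = str(PurePosixPath(output_root) / source.with_suffix(".py"))
    return plan


def init_entries(plan: dict[str, str]) -> dict[str, list[str]]:
    """Group translated class names by package folder, one list per __init__.py."""
    packages: dict[str, list[str]] = {}
    for target in plan.values():
        path = PurePosixPath(target)
        packages.setdefault(str(path.parent), []).append(path.stem)
    return packages


# Build a tiny in-memory JAR to exercise the sketch.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    jar.writestr("org/example/Foo.java", "package org.example; public class Foo {}")
    jar.writestr("org/example/util/Bar.java", "package org.example.util; public class Bar {}")

plan = plan_translations(buffer.getvalue())
print(plan)
print(init_entries(plan))
```

The sketch stops at planning output paths; parsing and translating each source file is the part JavaParser handles in the real application.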
Generated files for each job are placed into subfolders of the output folder according to the job name and version. These generated files are intended to be built into a Python package that can subsequently be installed into a Python virtual environment and imported.
If you found this repository, but you are looking for instructions on how to use autocomplete, code suggestions, and documentation when writing PySpigot scripts, visit PySpigot's documentation.
You may use DocsTranslator to translate Java source files into documented Python code. Download the latest release from the releases page. DocsTranslator is a standalone Java application, so you must run it with `java -jar docs-translator.jar`. You will likely want to modify some settings, so see the Settings section below for information on the configuration.
If you encounter any issues while using DocsTranslator, submit an issue report.
DocsTranslator generates log files in the `logs` folder of the working directory. A separate log file is created for each job, as well as a master log file for the main thread. Log files are created fresh for each run of the application; they are not appended.
The `settings.yml` file is the main configuration file for the project. If a `settings.yml` file doesn't already exist in the same directory as the DocsTranslator JAR file when it is run, then a default version is generated and placed there.
Options are outlined below.
Use this section to define a list of "translate jobs". Typically, each job is a different version of an artifact: for example, each version of the spigot-api (corresponding to each Minecraft version). To execute translate jobs more quickly, they are run in parallel using Java's ExecutorService.
This value is a list, and each item in the list should contain the following parameters:
- `pyPIName`: The name of the package (filled into the `settings.py`) for publishing to PyPI for this job.
- `pyPIVersion`: The version of the package (filled into the `settings.py`) for publishing to PyPI for this job.
- `artifacts`: A list of artifacts to be translated for this job (in the format `groupId:artifactId:version`).
- `pyModules` (optional): Modules to include in the final Python package that are not located within another package (with an `__init__.py` file). Each item should be a URL pointing to a remotely hosted `.py` file.
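As a small illustration of the `groupId:artifactId:version` format, the sketch below splits a coordinate string into its three parts. The `Artifact` type, the `parse_artifact` helper, and the sample job values are hypothetical and are not part of DocsTranslator.

```python
from typing import NamedTuple


class Artifact(NamedTuple):
    group_id: str
    artifact_id: str
    version: str


def parse_artifact(coordinate: str) -> Artifact:
    """Split a groupId:artifactId:version string, rejecting malformed input."""
    parts = coordinate.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected groupId:artifactId:version, got {coordinate!r}")
    return Artifact(*parts)


# Example job with made-up package name and version; only the keys come from the list above.
job = {
    "pyPIName": "example-stubs",
    "pyPIVersion": "1.0.0",
    "artifacts": ["org.spigotmc:spigot-api:1.20.4-R0.1-SNAPSHOT"],
}

parsed = [parse_artifact(coordinate) for coordinate in job["artifacts"]]
print(parsed[0])
```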
Options pertaining to batching jobs (running them in parallel).
- `threads`: The number of threads that the job executor service should use.
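The effect of the `threads` option can be pictured with a fixed-size thread pool. The sketch below uses Python's `concurrent.futures` as a stand-in for Java's ExecutorService; `run_job` and the job names are placeholders, not DocsTranslator code.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(name: str) -> str:
    # Placeholder for fetching, parsing, and translating one job.
    return f"{name}: done"


def run_all(job_names: list[str], threads: int) -> list[str]:
    # At most `threads` jobs run at the same time; the rest wait in the queue.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in input order even though jobs run concurrently.
        return list(pool.map(run_job, job_names))


results = run_all(["spigot-api-1.19", "spigot-api-1.20"], threads=2)
print(results)
```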
Options pertaining to fetching the JAR file (and its dependencies) to be translated. Downloaded JARs are placed into a local Maven repository.
- `path`: The path where the local Maven repository should be placed.
- `useCentral`: If set to `true`, Maven Central will be included as one of the remote repositories to search for dependencies.
- `repositories`: A list of remote repositories to be searched for the listed artifacts to translate (and their dependencies). Each item in the list should contain an `id` to identify it and a `url` pointing to the location of the remote repository.
- `deleteOnStart`: If set to `true`, the folder containing the local Maven repository will be deleted when DocsTranslator first runs.
- `excludeArtifacts`: A list of artifacts to exclude. Useful for excluding dependency artifacts from translation.
  - Use the format `groupId:artifactId` to exclude a single artifact.
  - Use the format `groupId` to exclude all artifacts under a particular group.
- `dependencyScope`: The scope to use to limit transitivity when fetching dependencies of an artifact.
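The two `excludeArtifacts` formats can be read as a simple matching rule, sketched below. The helper `is_excluded` is hypothetical, and it assumes a group rule matches only that exact `groupId`; whether DocsTranslator also matches subgroups is not stated here.

```python
def is_excluded(group_id: str, artifact_id: str, exclusions: list[str]) -> bool:
    """Return True if an artifact matches any exclusion rule."""
    for rule in exclusions:
        if ":" in rule:
            # groupId:artifactId excludes a single artifact.
            if rule == f"{group_id}:{artifact_id}":
                return True
        elif rule == group_id:
            # A bare groupId excludes every artifact in that group (exact match assumed).
            return True
    return False


rules = ["com.google.guava:guava", "org.yaml"]
print(is_excluded("com.google.guava", "guava", rules))
print(is_excluded("org.yaml", "snakeyaml", rules))
print(is_excluded("com.google.code.gson", "gson", rules))
```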
Options pertaining to sources from the Java Standard Library.
- `translate`: If set to `true`, any utilized Java Standard Library source files will also be translated.
- `path`: The path to the folder where the Java Standard Library sources are located.
- `group`: Used when adding docstrings to the generated `.py` modules for Java Standard Library sources.
- `name`: Used when adding docstrings to the generated `.py` modules for Java Standard Library sources.
- `version`: Used when adding docstrings to the generated `.py` modules for Java Standard Library sources.
Options pertaining to the generated `.py` files.
- `path`: The path to the folder where generated Python sources are placed.
- `deleteOnStart`: If set to `true`, the folder containing generated `.py` files will be deleted when DocsTranslator first runs.
Options pertaining to excluding imports in generated `.py` files.
- `packages`: A list of packages to exclude when adding imports to generated `.py` files.
- `classes`: A list of Java classes to exclude when adding imports to generated `.py` files.
Options that specify the format, structure, and syntax of generated Python code.
Options for module formatting.
- `docString`: The format of the docstring placed at the top of all generated Python modules.
- `importDeclaration`: The standard format of imports.
Options for class formatting.
- `declaration`: The format of a class declaration.
- `declarationExtending`: The format of a class declaration when the class extends/implements another class.
Options for enum formatting.
- `declaration`: The format of an enum declaration.
- `entryRegular`: The format of an enum entry without any arguments.
- `entryWithArgs`: The format of an enum entry with arguments.
Options for formatting of functions.
- `initDefinition`: The format of the `__init__` function for classes.
- `definition`: The format of a regular function definition.
- `parameterRegular`: The format of a regular function parameter.
- `parameterVararg`: The format of a VarArg parameter.
- `returnRegular`: The format of a function return statement if the corresponding Java method does not return anything.
- `returnWithValue`: The format of a function return statement if the corresponding Java method returns a value. Used only for translation of Java annotations (where an element may have a default value).
Options for formatting of fields.
- `initializer`: The format of a field with a value. Used mainly for `static` and `final` fields.
Options for translation of JavaDoc comments into Python docstrings.
- `author`: The format for the JavaDoc `@author` tag.
- `deprecated`: The format for the JavaDoc `@deprecated` tag.
- `param`: The format for the JavaDoc `@param` tag.
- `typeParam`: The format for any defined type parameter.
- `return`: The format for the JavaDoc `@return` tag.
- `see`: The format for the JavaDoc `@see` tag.
- `serial`: The format for the JavaDoc `@serial` tag.
- `serialData`: The format for the JavaDoc `@serialData` tag.
- `serialField`: The format for the JavaDoc `@serialField` tag.
- `since`: The format for the JavaDoc `@since` tag.
- `throw`: The format for the JavaDoc `@throws` tag.
- `version`: The format for the JavaDoc `@version` tag.
- `unknown`: The header format for any unknown or unparseable JavaDoc tag.
- `unknownTag`: The format for each unknown tag.
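The idea of a per-tag format can be sketched as a lookup from tag name to a template string. Everything below is an assumption made for illustration: the template strings in `TAG_FORMATS`, their placeholder names, and the `translate_tags` helper are invented and do not reflect DocsTranslator's actual defaults or syntax.

```python
import re

# Hypothetical templates; the real ones are defined in the settings.yml.
TAG_FORMATS = {
    "param": "- {name}: {text}",
    "return": "Returns: {text}",
    "throws": "Raises {name}: {text}",
}


def translate_tags(javadoc: str) -> list[str]:
    """Turn JavaDoc block tags into docstring lines using the templates above."""
    lines = []
    for raw in javadoc.splitlines():
        match = re.match(r"\s*@(\w+)\s+(.*)", raw)
        if not match:
            continue  # Not a block tag line.
        tag, rest = match.groups()
        if tag in ("param", "throws"):
            # The first word is the parameter or exception name.
            name, _, text = rest.partition(" ")
            lines.append(TAG_FORMATS[tag].format(name=name, text=text.strip()))
        elif tag == "return":
            lines.append(TAG_FORMATS[tag].format(text=rest.strip()))
        else:
            # Tags without a template are passed through unchanged.
            lines.append(f"@{tag} {rest.strip()}")
    return lines


print(translate_tags("@param code the char\n@return the color\n@since 1.0"))
```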
Options for generated `setup.py` and `pyproject.toml` files.
Options for the generated `setup.py` file.
- `author`: The author of the Python package.
- `authorEmail`: The author's email.
- `description`: The description of the Python package.
- `url`: The URL pointing to the site of the Python package.
- `pythonRequires`: The minimum required Python version to install the package.
- `classifiers`: Trove classifiers for the project.
Options for the generated `pyproject.toml` file.
- `requires`: Required modules/packages to build the project.
- `buildBackend`: The build backend for the project.
A list of lines to include in the `MANIFEST.in` file for the Python package.
A URL pointing to the license text to bundle with the Python package.
Because Java is statically-typed, but Python is not, translation of Java types to Python types is not perfect. Additionally, not all Java types are seamlessly interchangeable with Python types. For example:
- The Java `Collection` type is translated to a Python `Iterable`, although these types are not wholly interchangeable.
- Generics are not fully translated. This could be attempted with the `TypeVar` class (available in Python's `typing` module); however, I did not pursue this given that it would become quite complicated with Java classes that contain several generic methods.
- Python has no direct equivalent to Java's `char` type, which has consequences. For example, consider the following two overloaded methods from the `org.bukkit.ChatColor` class:

      @Nullable
      public static ChatColor getByChar(char code) { ... }

      @Nullable
      public static ChatColor getByChar(@NotNull String code) { ... }

  DocsTranslator translates these to:

      @overload
      @staticmethod
      def getByChar(code: str) -> "ChatColor": ...

      @overload
      @staticmethod
      def getByChar(code: str) -> "ChatColor": ...

  The translated Python functions are identical, even though they are not identical in Java. DocsTranslator translates `char` to `str`, and, as a consequence, these two translated functions accept the same parameters.
- Other examples of imperfect type translation exist. See the TypeUtils class for a better idea on how type translation is handled.
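For reference, this is roughly what a `TypeVar`-based translation of a generic method could look like. It is a sketch of the approach that was not pursued, not output that DocsTranslator produces, and the `Lists` class with its `first` method is a hypothetical stand-in for a Java method declared as `static <T> T first(List<T> items)`.

```python
from typing import List, TypeVar

# One TypeVar per Java type parameter.
T = TypeVar("T")


class Lists:
    """Hypothetical stub for a Java utility class with a generic static method."""

    @staticmethod
    def first(items: List[T]) -> T:
        """Returns the first element of the list."""
        ...
```

With one generic method this stays readable, but every additional type parameter, bound, or wildcard needs its own module-level `TypeVar`, which is where the complexity mentioned above comes from.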
Because JavaDoc strings are ultimately converted into HTML when generating JavaDocs for a Java project, HTML tags/elements are allowed when writing JavaDoc strings. This presents a problem when translating JavaDoc strings to Python docstrings because Python docstrings, unlike JavaDoc strings, do not natively support HTML tags/elements. There are some docstring parsers that use Markdown.
Based on my own testing, the Pylance extension in VSCode (the IDE I am using to assess translation quality) interprets some, but not all, Markdown syntax. Therefore, I have attempted to translate as much as I can from HTML to Markdown; however, some HTML tags remain untranslated, namely:
- HTML tags pertaining to tables: `<table>`, `<th>`, `<tr>`, `<td>`, etc.
- HTML heading tags: `<h1>`, `<h2>`, `<h3>`, etc.
- Other miscellaneous tags: `<blockquote>`, and more
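A minimal sketch of the HTML-to-Markdown idea is shown below, using plain regular expressions. The tag mapping in `REPLACEMENTS` is an assumption chosen for illustration; the set of tags DocsTranslator actually converts, and how it converts them, may differ.

```python
import re

# Assumed mapping from simple inline HTML tags to Markdown markers.
REPLACEMENTS = [
    (re.compile(r"</?(b|strong)>", re.I), "**"),
    (re.compile(r"</?(i|em)>", re.I), "*"),
    (re.compile(r"</?code>", re.I), "`"),
    (re.compile(r"<p>", re.I), "\n\n"),
    (re.compile(r"</p>", re.I), ""),
]


def html_to_markdown(text: str) -> str:
    """Replace a handful of inline HTML tags; anything else is left untouched."""
    for pattern, replacement in REPLACEMENTS:
        text = pattern.sub(replacement, text)
    return text.strip()


print(html_to_markdown("Returns <b>true</b> if <code>x</code> is set.<p>See <i>notes</i>.</p>"))
print(html_to_markdown("<table><tr><td>a</td></tr></table>"))
```

Tables and headings pass through unchanged in this sketch, which mirrors the limitation listed above: tags without a simple inline Markdown equivalent are harder to translate reliably.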
Building requires Maven and Git. Maven 3+ is recommended for building the project. Follow these steps:
- Clone the repository: `git clone https://github.com/magicmq/docs-translator.git`
- Enter the repository root: `cd docs-translator`
- Build with Maven: `mvn clean package`
- Built files will be located in the `target` directory.
Note: Maven shades Jython and some other runtime dependencies into the final JAR file. The shaded JAR is `docs-translator-{VERSION}.jar`, not `original-docs-translator-{VERSION}.jar`.
Any contributions you make to DocsTranslator are greatly appreciated.
If you have a suggestion or modification that would make DocsTranslator better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request