
DocsTranslator

DocsTranslator is a Java application that attempts to translate documented Java source code (a *-sources.jar file generated when building a project) into documented Python code that can be imported and utilized in Python scripts, with documentation of available classes/methods visible in the Python IDE.

Rationale

This project generates Python source files that are intended to be used in conjunction with PySpigot, a Python scripting engine that works within Minecraft. PySpigot utilizes Jython, which is a Python implementation that runs on the JVM. More specifically, this project allows for autocomplete, code suggestions, and documentation usage when writing Python scripts that utilize Java classes.

PySpigot scripts are able to access Java classes at runtime, but one issue is the lack of autocomplete and code suggestions when writing scripts, as none of these Java classes are available when writing Python.

Therefore, the objective of this project is to translate those Java classes (as well as their accompanying documentation) into readable Python source code, so that autocomplete, code suggestions, and documentation are available when writing PySpigot/Jython scripts.

How It Works

DocsTranslator relies heavily on the JavaParser library, which, in the simplest terms, reads Java source (.java) files and turns them into an abstract syntax tree. That tree can be read programmatically and translated into Python source files with relative ease.

The application runs in a stepwise fashion:

  1. The application initializes the Maven and JDK sources directories.
  2. Each translate job defined in the translateJobs section of the settings.yml is initialized.
  3. Initialized jobs are submitted to a multithreaded executor service and are run in parallel. For each job:
    1. Java source JAR files (i.e., those that follow the format *-sources.jar) are fetched from remote Maven repositories (for artifacts defined in the settings.yml) and installed into a local Maven repository using Apache Maven Resolver.
    2. Apache Maven Resolver resolves and fetches dependencies for all artifacts fetched in the previous step, using the scope specified in the settings.yml. Runtime dependencies are fetched by default, as these will be accessible at runtime.
    3. The application loops through the contents of all fetched JAR files. When it encounters a Java source file (ending in .java), the file is parsed with JavaParser, and a best-effort attempt is made to translate the source file into Python code.
      • Any files not ending in .java are ignored.
    4. Translated .py files are placed in the user-defined output folder (generated by default), in the appropriate package.
    5. An entry is added to the __init__.py file in the appropriate package, so that the Python module can be imported the same way a Java class is normally imported in Jython.
    6. Any source files from the Java Standard Library utilized by the previously translated Java source files are also translated in the same process outlined above.
      • JDK sources must be downloaded manually and placed in the appropriate folder (jdk-sources by default)
      • This step is only completed if enabled in the settings.yml (via the jdkSources.translate option)
    7. All __init__.py files are generated and placed in their appropriate locations.
    8. Python package-related files (setup.py, pyproject.toml, MANIFEST.in, LICENSE) are generated from options specified in the settings.yml and are placed in the user-defined output folder (generated by default).
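
Step 3 above (walking a sources JAR and keeping only .java entries) can be sketched in a few lines of Python. This is only an illustration of the filtering idea, not DocsTranslator's actual Java implementation; the helper names and the in-memory demo archive are made up for this example.

```python
import io
import zipfile


def java_sources_in_jar(jar_bytes: bytes) -> list[str]:
    """Return the names of all .java entries in a sources JAR, sorted."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        # A JAR is a ZIP archive; anything not ending in .java is skipped.
        return sorted(name for name in jar.namelist() if name.endswith(".java"))


def build_demo_jar() -> bytes:
    """Build a tiny in-memory stand-in for a *-sources.jar (demo data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        jar.writestr("org/bukkit/ChatColor.java", "package org.bukkit;\n")
        jar.writestr("org/bukkit/package.html", "<html></html>\n")
    return buffer.getvalue()


if __name__ == "__main__":
    print(java_sources_in_jar(build_demo_jar()))
```

In the real application, each surviving entry would then be handed to JavaParser for translation.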

Generated files for each job are placed into subfolders of the output folder according to the job name and version. These generated files are intended to be built into a Python package that can subsequently be installed into a Python virtual environment and imported.
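
To make steps 4 and 5 more concrete, the sketch below maps a JAR entry to the path of its translated module and to an entry in the package's __init__.py. The relative-import form of that entry is an assumption made for illustration; the exact text DocsTranslator writes may differ.

```python
from pathlib import PurePosixPath


def output_module_path(java_entry: str) -> str:
    """Map a JAR entry such as org/bukkit/ChatColor.java to a .py path."""
    return str(PurePosixPath(java_entry).with_suffix(".py"))


def init_entry(java_entry: str) -> tuple[str, str]:
    """Return (path of the package's __init__.py, import line to add).

    The import line format is hypothetical, chosen for this sketch.
    """
    path = PurePosixPath(java_entry)
    class_name = path.stem
    init_path = str(path.parent / "__init__.py")
    return init_path, f"from .{class_name} import {class_name}"


if __name__ == "__main__":
    print(output_module_path("org/bukkit/ChatColor.java"))
    print(init_entry("org/bukkit/ChatColor.java"))
```

With an entry like this in place, `from org.bukkit import ChatColor` resolves to the class itself rather than to the ChatColor module, which matches how the import reads in Jython.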

Usage

If you found this repository, but you are looking for instructions on how to use autocomplete, code suggestions, and documentation when writing PySpigot scripts, visit PySpigot's documentation.

You may use DocsTranslator to translate Java source files into documented Python code. Download the latest release from the releases page. DocsTranslator is a standalone Java application, so you must run it with java -jar docs-translator.jar. You will likely want to modify some settings, so see the Settings section below for information on the configuration.

If you encounter any issues while using DocsTranslator, submit an issue report.

Logging

DocsTranslator generates log files in the logs folder of the working directory. A separate log file is created for each job, along with a master log file for the main thread. Log files are created fresh for each run of the application; they are not appended.
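
The "fresh file per run, one file per job" behavior can be illustrated with Python's logging module. This is an analogy only: DocsTranslator is a Java application, and the file naming scheme and function name below are assumptions.

```python
import logging
from pathlib import Path


def make_job_logger(log_dir: Path, job_name: str) -> logging.Logger:
    """Create a logger that writes to its own file in log_dir."""
    logger = logging.getLogger(f"demo.{job_name}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep job output out of the root logger
    # mode="w" truncates the file, so each run starts with an empty log.
    handler = logging.FileHandler(
        log_dir / f"{job_name}.log", mode="w", encoding="utf-8"
    )
    logger.addHandler(handler)
    return logger
```

Opening the file in append mode (`mode="a"`, the FileHandler default) would instead accumulate output across runs.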

Settings

The settings.yml file is the main configuration file for the project. If a settings.yml file doesn't already exist in the same directory as the DocsTranslator JAR file when it is run, then a default version is generated and placed there.

Options are outlined below.

translateJobs:

Use this section to define a list of "translate jobs". Typically, each job is a different version of an artifact: for example, each version of the spigot-api (corresponding to each Minecraft version). To execute translate jobs more quickly, they are run in parallel using Java's ExecutorService.

This value is a list, and each item in the list should contain the following parameters:

  • pyPIName: The name of the package (filled into the setup.py) for publishing to PyPI for this job.
  • pyPIVersion: The version of the package (filled into the setup.py) for publishing to PyPI for this job.
  • artifacts: A list of artifacts to be translated for this job (in the format groupId:artifactId:version).
  • pyModules (optional): Modules to include in the final Python package that are not located within another package (with an __init__.py file). Each item should be a URL pointing to a remotely hosted .py file.
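
For illustration, the groupId:artifactId:version format used by the artifacts parameter can be split and validated as shown below. The Artifact type and parse_artifact function are hypothetical names for this sketch, and the version string in the example is only a sample value.

```python
from typing import NamedTuple


class Artifact(NamedTuple):
    group_id: str
    artifact_id: str
    version: str


def parse_artifact(coordinates: str) -> Artifact:
    """Split 'groupId:artifactId:version' into its three parts."""
    parts = coordinates.split(":")
    # Reject strings with the wrong number of parts or an empty part.
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected groupId:artifactId:version, got {coordinates!r}"
        )
    return Artifact(*parts)


if __name__ == "__main__":
    print(parse_artifact("org.spigotmc:spigot-api:1.20.4-R0.1-SNAPSHOT"))
```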

batching:

Options pertaining to batching jobs (running them in parallel).

  • threads: The number of threads that the job executor service should use
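
As a rough Python analogue of running jobs on a fixed-size thread pool (the application itself uses Java's ExecutorService), the sketch below uses concurrent.futures. The run_job body is a placeholder, and both function names are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job_name: str) -> str:
    """Placeholder for one translate job (fetch, translate, package)."""
    return f"{job_name}: done"


def run_all(job_names: list[str], threads: int) -> list[str]:
    """Run every job on a pool of `threads` worker threads."""
    # map() yields results in submission order even though jobs overlap.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(run_job, job_names))


if __name__ == "__main__":
    print(run_all(["job-a", "job-b", "job-c"], threads=2))
```

With fewer threads than jobs, the remaining jobs simply wait in the queue until a worker becomes free.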

maven:

Options pertaining to fetching the JAR file (and its dependencies) to be translated. Downloaded JARs are placed into a local Maven repository.

  • path: The path where the local Maven repository should be placed.
  • useCentral: If set to true, Maven Central will be included as one of the remote repositories to search for dependencies.
  • repositories: A list of remote repositories to be searched for the listed artifacts to translate (and their dependencies). Each item in the list should contain an id to identify it and a url pointing to the location of the remote repository.
  • deleteOnStart: If set to true, the folder containing the local Maven repository will be deleted when DocsTranslator first runs.
  • excludeArtifacts: A list of artifacts to exclude. Useful for excluding dependency artifacts from translation.
    • Use the format groupId:artifactId to exclude a single artifact
    • Use the format groupId to exclude all artifacts under a particular group
  • dependencyScope: The scope to use to limit transitivity when fetching dependencies of an artifact.
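
The two excludeArtifacts formats can be read as the matching rule sketched below. It assumes that a bare groupId must match the group exactly (not as a prefix); that detail is not stated above, so treat it as a guess. The function name and sample coordinates are illustrative.

```python
def is_excluded(coordinates: str, exclusions: list[str]) -> bool:
    """Check 'groupId:artifactId[:version]' against exclusion rules."""
    group_id, artifact_id = coordinates.split(":")[:2]
    for rule in exclusions:
        if ":" in rule:
            # groupId:artifactId rule: excludes one specific artifact.
            if rule == f"{group_id}:{artifact_id}":
                return True
        elif rule == group_id:
            # groupId rule: excludes every artifact in that group.
            return True
    return False


if __name__ == "__main__":
    rules = ["com.google.guava:guava", "org.yaml"]
    print(is_excluded("org.yaml:snakeyaml:2.2", rules))
```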

jdkSources:

Options pertaining to sources from the Java Standard Library.

  • translate: If set to true, any utilized Java Standard Library source files will also be translated.
  • path: The path to the folder where the Java Standard Library sources are located.
  • group: Used when adding docstrings to the generated .py modules for Java Standard Library sources.
  • name: Used when adding docstrings to the generated .py modules for Java Standard Library sources.
  • version: Used when adding docstrings to the generated .py modules for Java Standard Library sources.

output:

Options pertaining to the generated .py files.

  • path: The path to the folder where generated Python sources are placed.
  • deleteOnStart: If set to true, the folder containing generated .py files will be deleted when DocsTranslator first runs.

importExclusions:

Options pertaining to excluding imports in generated .py files.

  • packages: A list of packages to exclude when adding imports to generated .py files.
  • classes: A list of Java classes to exclude when adding imports to generated .py files.

formats:

Options that specify the format, structure, and syntax of generated Python code.

module:

Options for module formatting.

  • docString: The format of the docstring placed at the top of all generated Python modules.
  • importDeclaration: The standard format of imports.

class_:

Options for class formatting.

  • declaration: The format of a class declaration.
  • declarationExtending: The format of a class declaration when the class extends/implements another class.

enum:

Options for enum formatting.

  • declaration: The format of an enum declaration.
  • entryRegular: The format of an enum entry without any arguments.
  • entryWithArgs: The format of an enum entry with arguments.

function:

Options for formatting of functions.

  • initDefinition: The format of the __init__ function for classes.
  • definition: The format of a regular function definition.
  • parameterRegular: The format of a regular function parameter.
  • parameterVararg: The format of a VarArg parameter.
  • returnRegular: The format of a function return statement if the corresponding Java method does not return anything.
  • returnWithValue: The format of a function return statement if the corresponding Java method returns a value. Used only for translation of Java annotations (where an element may have a default value).

field:

Options for formatting of fields.

  • initializer: The format of a field with a value. Used mainly for static and final fields.

docString:

Options for translation of JavaDoc comments into Python docstrings.

  • author: The format for the JavaDoc @author tag.
  • deprecated: The format for the JavaDoc @deprecated tag.
  • param: The format for the JavaDoc @param tag.
  • typeParam: The format for any defined type parameter.
  • return: The format for the JavaDoc @return tag.
  • see: The format for the JavaDoc @see tag.
  • serial: The format for the JavaDoc @serial tag.
  • serialData: The format for the JavaDoc @serialData tag.
  • serialField: The format for the JavaDoc @serialField tag.
  • since: The format for the JavaDoc @since tag.
  • throw: The format for the JavaDoc @throws tag.
  • version: The format for the JavaDoc @version tag.
  • unknown: The header format for any unknown or unparseable JavaDoc tag.
  • unknownTag: The format for each unknown tag.

packaging:

Options for generated setup.py and pyproject.toml files.

setup:

Options for the generated setup.py file.

  • author: The author of the Python package.
  • authorEmail: The author's email.
  • description: The description of the Python package.
  • url: The URL pointing to the site of the Python package.
  • pythonRequires: The minimum required Python version to install the package.
  • classifiers: Trove classifiers for the project.

pyProject:

Options for the generated pyproject.toml file.

  • requires: Required modules/packages to build the project.
  • buildBackend: The build backend for the project.

manifest:

A list of lines to include in the MANIFEST.in file for the Python package.

license:

A URL pointing to the license text to bundle with the Python package.

Caveats/Known Issues

Type Translation

Because Java is statically-typed, but Python is not, translation of Java types to Python types is not perfect. Additionally, not all Java types are seamlessly interchangeable with Python types. For example:

  • The Java Collection type is translated to a Python Iterable, although these types are not wholly interchangeable.
  • Generics are not fully translated. This could be attempted with the TypeVar class (available in Python's typing module); however, I did not pursue this because it would become quite complicated for Java classes that contain several generic methods.
  • Python has no direct equivalent to Java's char type, which has consequences. For example, consider the following two overloaded methods from the org.bukkit.ChatColor class:
    @Nullable
    public static ChatColor getByChar(char code) {
        ...
    }
    
    @Nullable
    public static ChatColor getByChar(@NotNull String code) {
        ...
    }
    DocsTranslator translates these to:
    @overload
    @staticmethod
    def getByChar(code: str) -> "ChatColor":
        ...
    
    @overload
    @staticmethod
    def getByChar(code: str) -> "ChatColor":
        ...
    The translated Python functions are identical, even though they are not identical in Java. DocsTranslator translates char to str, and, as a consequence, these two translated functions accept the same parameters.
  • Other examples of imperfect type translation exist. See the TypeUtils class for a better idea on how type translation is handled.
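
The getByChar collision can be reproduced with a toy type table. The three mappings below come from the examples in this section; the table, the pass-through for unmapped names, and the function names are assumptions and do not reflect how the TypeUtils class is written.

```python
# Toy mapping taken from the examples above; the real rules are richer.
JAVA_TO_PYTHON = {
    "char": "str",
    "String": "str",
    "Collection": "Iterable",
}


def translate_params(java_types: list[str]) -> tuple[str, ...]:
    """Translate a Java parameter type list; unknown names pass through."""
    return tuple(JAVA_TO_PYTHON.get(name, name) for name in java_types)


def colliding_overloads(overloads: list[list[str]]) -> bool:
    """True if two distinct Java overloads translate to the same signature."""
    translated = [translate_params(params) for params in overloads]
    return len(set(translated)) < len(translated)


if __name__ == "__main__":
    # getByChar(char) and getByChar(String) both become (str,).
    print(colliding_overloads([["char"], ["String"]]))
```

Any mapping that sends two Java types to one Python type can produce this kind of duplicate overload.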

JavaDoc Translation

Because JavaDoc strings are ultimately converted into HTML when generating JavaDocs for a Java project, HTML tags/elements are allowed when writing JavaDoc strings. This presents a problem when translating JavaDoc strings to Python docstrings because Python docstrings, unlike JavaDoc strings, do not natively support HTML tags/elements. There are some docstring parsers that use Markdown.

Based on my own testing, it seems that the Pylance extension in VSCode, the IDE I am using to assess translation quality, interprets some (but not all) Markdown syntax. Therefore, I have attempted to translate as much as I can from HTML to Markdown. However, some HTML tags remain untranslated, namely:

  • HTML tags pertaining to tables: <table>, <th>, <tr>, <td>, etc.
  • HTML heading tags: <h1>, <h2>, <h3>, etc.
  • Other miscellaneous tags: <blockquote>, and more
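
A minimal sketch of this kind of HTML-to-Markdown rewriting is shown below. The set of inline tags it handles is an assumption chosen for illustration (the tags DocsTranslator actually converts are not listed here); table, heading, and blockquote tags are deliberately left untouched, mirroring the list above.

```python
import re

# Hypothetical subset of inline tags and their Markdown markers.
INLINE_TAGS = {"b": "**", "strong": "**", "i": "*", "em": "*", "code": "`"}


def html_to_markdown(text: str) -> str:
    """Replace simple attribute-free inline tags with Markdown markers."""
    for tag, marker in INLINE_TAGS.items():
        # Matches <tag> and </tag> exactly, so <b> never matches <blockquote>.
        text = re.sub(rf"</?{tag}>", marker, text, flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    print(html_to_markdown("Returns <b>true</b> if <code>x</code> is set"))
```

A regex approach like this only works for tags without attributes or nesting subtleties; structured elements such as tables need a real HTML parser.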

Building DocsTranslator

Building requires Maven and Git. Maven 3+ is recommended for building the project. Follow these steps:

  1. Clone the repository: git clone https://github.com/magicmq/docs-translator.git
  2. Enter the repository root: cd docs-translator
  3. Build with Maven: mvn clean package
  4. Built files will be located in the target directory.

Note: Maven shades Jython and some other runtime dependencies into the final JAR file. The shaded JAR is docs-translator-{VERSION}.jar, not original-docs-translator-{VERSION}.jar.

Contributing

Any contributions you make to DocsTranslator are greatly appreciated.

If you have a suggestion or modification that would make DocsTranslator better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request
