Skip to content

Design Document for Python Script Generator#1135

Closed
juliale-15 wants to merge 2 commits intoapache:masterfrom
juliale-15:master
Closed

Design Document for Python Script Generator#1135
juliale-15 wants to merge 2 commits intoapache:masterfrom
juliale-15:master

Conversation

@juliale-15
Copy link
Contributor

@A-Postl and I created this first version of a design document for the python script generator. We would appreciate feedback if our planned approach could work like this or not.

Copy link
Contributor

@Baunsgaard Baunsgaard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

there are some minor locations that i want to change, for the files but other than that, it sounds nice.

You might want to consider making it such that the thing you parse first is the DML defined function and return (Defined on line 53-56 in this document). This is what the program reacts on and not the documentation above. Then after this use that definition of function to verify that the comment above is aligned with the function definition.
This would be very useful in our test suite as well, to enforce a scheme for future function definitions, or maybe later automated correction of the documentation.


## Generator Folder Structure

We plan on creating a folder called `generator` in `src/main/python/systemds`. In this folder we will place our `generator.py`, a `parser.py` and a folder called `scripts/builtin` where we will place the generated python scripts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The folder src/main/python/systemds is part of the distributed pip package afterwards, i think it would be better if this generating script is not.
Therefore making a the neww folder generator directly on the src/main/python path would be good.

:param eps: Tolerance for the algorithm to declare convergence using WCSS change ratio.
:param is_verbose: Boolean flag if the algorithm should be run in a verbose manner.
:param avg_sample_size_per_centroid: The average number of records per centroid in the data samples.
:return: `OperationNode` List containing two outputs 1. the clusters, 2 the cluster ID associated with each row in x.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the multi return is not nice as it is, so if you have better suggestions feel free to design it better.


1. `src/main/python/systemds/generator/generator.py`
2. `src/main/python/systemds/generator/parser.py`
3. `src/main/python/systemds/generator/scripts/builtin/` No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like the scripts that are generated to be located in:

systemds/operator/algorithm/builtin/kmeans.py

Then furthermore when they are located there, make it so that the init.py in systemds/operator/__init__.py loads in all the builtin functions generated.

@Baunsgaard
Copy link
Contributor

Closed and merge, since you already have a PR for the implementation.
the only change i did was move the file to a different folder.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants