Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FE: Implement FE service #73

Merged
merged 5 commits into from
Mar 21, 2018
Merged

FE: Implement FE service #73

merged 5 commits into from
Mar 21, 2018

Conversation

smacker
Copy link
Contributor

@smacker smacker commented Mar 16, 2018

Part of #51

Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
@bzz bzz mentioned this pull request Mar 16, 2018
7 tasks
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
@smacker smacker changed the title [WIP] FE: Implement FE service FE: Implement FE service Mar 20, 2018
@carlosms
Copy link
Contributor

Should we add the python tests to travis in this PR?

@smacker
Copy link
Contributor Author

smacker commented Mar 20, 2018

@carlosms what do you mean? they are added. Do you want to remove them? (I don't)

@carlosms
Copy link
Contributor

My bad, I scanned the PR quickly thought your changes were testing the docker build only.

.travis.yml Outdated
@@ -54,6 +54,12 @@ matrix:
env: STYLE_CHECK=true
script: ./sbt scalastyle

- scala: 2.11.2
script:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's minor, but for convenience and consistency, could you please add env section to this build with something descriptive and human-readable, like FE_PYTHON_TEST=true ?

It does not mean much yet, but is super-helpful with TravisCI, when a project has many profiles, to quickly identify the purpose of the failing one.

Like i.e here one would be able to tell that it's only Integration Tests failing for this job.
screen shot 2018-03-21 at 3 05 06 pm

@bzz
Copy link
Contributor

bzz commented Mar 21, 2018

Looks awesome, a quick question - is there a reproducible way how one can generate/update the fixtures for the FE?

"""Extract identifiers weighted set"""

extractor = IdentifiersBagExtractor(
docfreq_threshold=request.docfreqThreshold, split_stem=request.splitStem)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see, all BagsExtractors, of which IdentifiersBagExtractor is a particular kind, have a weight parameter - do you think it should be added on our side as well?

AFAIK it's something that most probably going to be used as part of parameter adjustment as soon as the values are inferred from a labeled data, provided by 🐈 .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh. I'm not sure how I missed it in the task about creating proto files. My bad.

Do you mind if I add this parameter in a separate PR? It should be in all extractors and requires changes in proto & scala.

"""Extract uast2seq weighted set"""

extractor = UastSeqBagExtractor(
docfreq_threshold=request.docfreqThreshold)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same situation with arguments of the base algorithm class here , allthough these guys token2index and token_parser class name might be of less value right now and could be added later, as soon as they are really needed.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found only 1 implementation for token parser and token index in ml lib. So I decided, for now, we don't need additional parameters there.

if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Feature Extractor Service.')
parser.add_argument("--port", type=int, default=9001,
help="server listen port")
args = parser.parse_args()

# sourced-ml expects PYTHONHASHSEED != random or unset
if os.getenv("PYTHONHASHSEED", "random") == "random":
Copy link
Contributor

@bzz bzz Mar 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, these are src-d/ml implementation details, relevant to some extractors.

What do you think - on our side, would it be wise to be more careful and make put them as part of the API and pass them as explicit parameters of RPC request?

This could help to decouple a library API with it's runtime environment, which is a good practice anyway.

Copy link
Contributor Author

@smacker smacker Mar 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. This parameter changes how python interpreter works. You can’t change it in runtime. That’s why the check is in __main__.

Copy link
Contributor

@bzz bzz Mar 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Reading https://docs.python.org/3.3/using/cmdline.html#envvar-PYTHONHASHSEED helped to understand it better, which would be nice to have somewhere (i.e in that comment above the code)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a comment already. 2 lines below.

Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
@smacker
Copy link
Contributor Author

smacker commented Mar 21, 2018

You can generate a fixture using sdk. The command for it was added here bblfsh/sdk#249

all comments were addressed (by comments). @bzz could you please take another look? I see only 1 problem: weight param. But I strongly prefer to add it in separate PR to avoid conflicts.

@bzz
Copy link
Contributor

bzz commented Mar 21, 2018

You can generate a fixture using sdk. The command for it was added here bblfsh/sdk#249

Really, really nice, thank you for linking 👍 ! I only wish it could be documented somewhere, i.e same as code generation from .proto, so next person has to update the tests or something like that - it saves time for a him/her.

but I strongly prefer to add it in separate PR to avoid conflicts.

sounds good to me

Copy link
Contributor

@bzz bzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, sans very minor documentation issue on test fixture generation.

@smacker
Copy link
Contributor Author

smacker commented Mar 21, 2018

I'm not sure about the rightest place to put info about fixtures for now.

I believe I would need fixtures for integration test scala-python (last task in umbrella issue), it might change the location of fixtures or I'll create some script. I'll document how to generate fixtures in that task when figure out the best way to do it.

@smacker smacker merged commit 28277f3 into src-d:master Mar 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants