Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature suggestion: GenericParser mixin #2330

Open
gethvi opened this issue Mar 8, 2023 · 7 comments
Open

Feature suggestion: GenericParser mixin #2330

gethvi opened this issue Mar 8, 2023 · 7 comments

Comments

@gethvi
Copy link
Contributor

gethvi commented Mar 8, 2023

On second thought after implementing #2329 I think the implementation of time_format parameter (and possibly other parameters used in generic parsers) could be a reasonable base for a new GenericParser mixin. Therefore it would be only implemented once. (I realized this now because I copied the if condition for the PR from HTML table parser and broke the DRY rule.)

What do you think @sebix @kamil-certat ?

@gethvi
Copy link
Contributor Author

gethvi commented Mar 8, 2023

Other parameters possibly suitable for GenericParser mixin is type_translation, default_url_protocol, ignore_values, data_type.

@sebix
Copy link
Member

sebix commented Mar 8, 2023

If we have a GenericParser Mixin with the five parameters/features listed by you, how many could use this mixin == use all of these parameters?
Which functionality would the Mixin implement?

@gethvi
Copy link
Contributor Author

gethvi commented Mar 8, 2023

At least the validation of the parameters.

Yes it would be used only for a few bots such as HTML Table, CSV, possibly JSON and KeyValue parser. But still, it's less code without repeating it.

@kamil-certat
Copy link
Contributor

Hey,
I'm unsure about the implementation - mixin, a module with functions handling conversion or even descriptors - but such things should be definitely extracted. But to be honest, in your case - why you didn't use DateTime.convert(...)? I think this is a static method intended to be used to avoid code duplication. You could eventually think about a helper or so that raises the InvalidArgument when the format is wrong (or convert method failed, this is also a pythonic way 'first try, then handle'). Or maybe I have missed something?

However, I would also point out, that we may think about switching from custom parsing on everything, to using e.g. marshmallow for schema validation.

@gethvi
Copy link
Contributor Author

gethvi commented Mar 9, 2023

Thank you, you are right, using the DateTime.convert() is definitely the better way to do it. I will update the PR.

Frankly, I just noticed that HTML Table parser uses TIME_CONVERSIONS from DateTime and CSV parser uses it's own TIME_CONVERSIONS. So I made it the same way as HTML Table (it was the most straightforward solution). Quick and (unfortunately) dirty.

As for using the marshmallow I was thinking the same thing that such thing would be really nice to have. However I was considering pydantic (which I think would also sit better with the IntelMQ API).

@sebix
Copy link
Member

sebix commented Mar 9, 2023

Yes it would be used only for a few bots such as HTML Table, CSV, possibly JSON and KeyValue parser.

But they don't use all of these parameters. Even if one Mixin provides these parameters, the parsers themselves would again need to remove some of them.

But still, it's less code without repeating it.

I fully comply with this goal :)

@gethvi
Copy link
Contributor Author

gethvi commented Mar 9, 2023

It was just a suggestion of parameters I think could be useful for all generic parsers. So yes, they don't use them know, but I think they could in the future and it would be without adding any code, just moving it around a bit. :) (probably even reducing the amount of code)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants