Can't use StripTransform in .yml file #391
Comments
The way I've handled this in the past is to add the module arg when invoking a reader/transform/writer that's not already imported. i.e.
|
Hi Mike!
Good catch. This is basically me being inconsistent in design philosophy
and driving down the middle of the road. When I originally wrote listen.py,
I had it import every reader/writer/transform under the sun. Then, as
transforms multiplied, I felt like it started to get cumbersome - should it
really include everything, even though most of it won't ever be used? How
much does that slow down startup (importing, reading, parsing all that
extra code)? So figured that, beyond the original set, I'd just count on
folks specifying where each component lived, using the 'module'
specification, as Webb describes above.
But I realize I've not made that explicit in the documentation.
Would love your thoughts (and Webb's) on which way to go on this.
…On Sun, May 19, 2024 at 4:39 AM Webb Pinner ***@***.***> wrote:
The way I've handled this in the past is to add the module arg when
invoking a reader/transform/writer that's not already imported. i.e.
transforms:
  - class: SplitTransform
    module: logger.transforms.split_transform
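For readers wondering how a per-component module key like the one above can work, here is a minimal sketch. It is not the actual listen.py code; the function name class_from_spec and the stdlib stand-in class are assumptions made purely for illustration.

```python
import importlib


def class_from_spec(spec, namespace=None):
    """Return the class named by spec['class'].

    If spec has a 'module' key, import that module on demand and look the
    class up there. Otherwise fall back to a dict of already-imported names
    (hypothetical stand-in for whatever the caller has imported).
    """
    class_name = spec["class"]
    module_name = spec.get("module")
    if module_name:
        module = importlib.import_module(module_name)
        return getattr(module, class_name)
    namespace = namespace or {}
    if class_name not in namespace:
        raise KeyError(f"{class_name} is not imported; add a 'module' key")
    return namespace[class_name]


# Stand-in example using a stdlib class instead of a real transform.
spec = {"class": "OrderedDict", "module": "collections"}
print(class_from_spec(spec).__name__)
```

The upside of this style is that nothing is imported until a config asks for it; the downside, as discussed in this thread, is that the config author has to know the module path.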
|
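Would something like this work: use the readers/transforms/writers __init__.py files to import all the readers/transforms/writers.
Then listen.py would just import the readers, transforms, and writers packages.
This approach would require any new readers/transforms/writers to be added to their corresponding __init__.py files when they are added to the repository, but other files wanting to import all available readers/transforms/writers wouldn't have to change. This doesn't take into account performance ramifications. |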
That hadn't occurred to me at first. In my mind that was just for local additions, like |
My kneejerk reaction is that I'd hate to have to specify the module all over the place in the .yml files for "built in" transforms, readers, or writers. (I haven't tested this, do I need to do it everywhere or just the first occurrence?) I feel like as a writer of a .yml file, I shouldn't have to know in advance which modules are imported for me already... I should be able to assume all the supplied modules are ready to use.
I like the idea of controlling it via the __init__ files instead of groups of imports in listen.py, if that can be done with minimal effort. I actually was surprised to see them all empty w/out any import or prep code in them.
Regarding performance, I'm unsure, but loading modules that don't have much/any init code in them shouldn't be that impactful. I haven't bumped into any dynamic code that runs at module load-time in any of the files I've poked at. Once everything gets loaded once, and .pyc files created, I'd view it as about as close to free as you can get with Python. Now, if we start trying to dynamically load things by globbing on the filesystem, that would get slow (although again, I don't know how slow).
So, I personally think we should either just add the missing imports to listen.py (with the intent to always add new imports there when we add new classes) or do it in a series of __init__ files, whichever is easier. I should mention, I did not double-check Readers/Writers when I came up with that list of missing Transforms. |
Okay - I'm going to have a go at implementing the __init__.py solution.
Will keep you posted.
…On Sun, May 19, 2024 at 1:48 PM Michael D Labriola ***@***.***> wrote:
Would something like this work: Use the readers/transforms/writers
__init__.py files to import all the readers/transforms/writers. i.e.
# logger/readers/__init__.py
from . import cached_data_reader
from . import logfile_reader
from . import network_reader
from . import tcp_reader
from . import udp_reader
from . import redis_reader
from . import serial_reader
from . import text_file_reader
from . import database_reader
from . import composed_reader
Then listen.py would just:
from logger import readers
from logger import transforms
from logger import writers
This approach would require any new readers/transforms/writers to be added
to their corresponding __init__.py files when they are added to the
repository but other files wanting to import all available
readers/transforms/writers wouldn't have to change.
This doesn't take into account performance ramifications.
My kneejerk reaction is that I'd hate to have to specify the module all
over the place in the .yml files for "built in" transforms, readers, or
writers. (I haven't tested this, do I need to do it everywhere or just the
first occurrence?) I feel like as a writer of a .yml file, I shouldn't have
to know in advance which modules are imported for me already... I should be
able to assume all the supplied modules are ready to use.
I like the idea of controlling it via the __init__ files instead of
groups of imports in listen.py, if that can be done with minimal effort.
I actually was surprised to see them all empty w/out any import or prep
code in them.
Regarding performance, I'm unsure, but loading modules that don't have
much/any init code in them shouldn't be that impactful. I haven't bumped
into any dynamic code that runs at module load-time in any of the files I've
poked at. Once everything gets loaded once, and .pyc files created, I'd
view it as about as close to free as you can get with Python. Now, if we start
trying to dynamically load things by globbing on the filesystem, that would
get slow (although again, I don't know *how* slow).
So, I personally think we should either just add the missing imports to
listen.py (with the intent to always add new imports there when we add new
classes) or do it in a series of __init__ files, whichever is easier. I
should mention, I did not double-check Readers/Writers when I came up with
that list of missing Transforms.
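Since nobody in the thread has numbers on the startup cost, a rough way to get some is sketched below. The helper name time_import is hypothetical, and the stdlib module names are stand-ins for the real reader/transform/writer modules, which would need to be substituted in.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Import module_name and return (seconds_elapsed, was_already_loaded).

    If the module is already in sys.modules the import is a cache hit and
    the elapsed time will be close to zero.
    """
    was_loaded = module_name in sys.modules
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start, was_loaded


# Stand-in module names; replace with e.g. the logger.transforms modules.
for name in ["colorsys", "sched", "wave"]:
    elapsed, was_loaded = time_import(name)
    note = " (already loaded)" if was_loaded else ""
    print(f"{name}: {elapsed * 1000:.3f} ms{note}")
```

For a per-module breakdown of a whole program's startup, CPython 3.7 and later also accept the -X importtime command-line option, which may be less effort than instrumenting by hand.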
|
Cool! FYI, I haven't noticed any performance difference explicitly importing all the transform modules via listen.py |
The one change I've had to make (branch issue_391) is that instead of Webb's
from . import cached_data_reader
from . import logfile_reader
I'm having to do
from .cached_data_reader import CachedDataReader
from .composed_reader import ComposedReader
And in listen.py I've gone with
from logger.readers import *
Otherwise, in the code (and configs) you have to invoke it as
cached_data_reader.CachedDataReader. I don't see a clean way to do it
without "*" imports.
Do either of you have ideas on a cleaner way to do it, or shall I go with
this?
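One possible alternative to the "*" import, sketched here only as an idea and not as what branch issue_391 does: keep the class re-exports in __init__.py, but have the caller look classes up by name on the package. The package demo_readers, the class DemoReader, and the helper lookup are all hypothetical; the sketch builds a throwaway package in a temp directory so it can run on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package that mimics the layout discussed in this thread.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_readers"
pkg.mkdir()
(pkg / "demo_reader.py").write_text(
    "class DemoReader:\n"
    "    def read(self):\n"
    "        return 'record'\n"
)
# The package __init__ re-exports the class and lists it in __all__.
(pkg / "__init__.py").write_text(
    "from .demo_reader import DemoReader\n"
    "__all__ = ['DemoReader']\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()


def lookup(class_name, package_name):
    """Find a class by its bare name on a package, without 'import *'."""
    package = importlib.import_module(package_name)
    if class_name not in getattr(package, "__all__", ()):
        raise KeyError(f"{class_name} is not exported by {package_name}")
    return getattr(package, class_name)


reader_class = lookup("DemoReader", "demo_readers")
print(reader_class().read())
```

Configs could still use bare class names this way, and __all__ doubles as the explicit list of what each package supplies. Whether that is worth the extra indirection compared with a plain "*" import is a judgment call.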
…On Tue, May 21, 2024 at 2:56 PM Michael D Labriola ***@***.***> wrote:
Okay - I'm going to have a go at implementing the __init__.py solution.
Will keep you posted.
Cool! FYI, I haven't noticed any performance difference explicitly
importing all the transform modules via listen.py
|
I was afraid of that. I did a quick little test myself the other day, couldn't get it right, and just figured my skills were rusty. If we can't get "*" imports working, and we're left having to invoke as cached_data_reader.CachedDataReader, that's going to break every yaml config file in the universe and make them all that much harder to read... I'd rather live with having the list of import statements in the not-obvious listen.py. |
It works fine with "from logger.readers import *" - it's just that
importing "*" is considered poor form.
…On Tue, May 21, 2024 at 4:12 PM Michael D Labriola ***@***.***> wrote:
The one change I've had to make (branch issue_391) is that instead of Webb's
from . import cached_data_reader
from . import logfile_reader
I'm having to do
from .cached_data_reader import CachedDataReader
from .composed_reader import ComposedReader
And in listen.py I've gone with
from logger.readers import *
Otherwise, in the code (and configs) you have to invoke it as
cached_data_reader.CachedDataReader. I don't see a clean way to do it
without "*" imports.
Do either of you have ideas on a cleaner way to do it, or shall I go with
this?
I was afraid of that. I did a quick little test myself the other day,
couldn't get it right, and just figured my skills were rusty.
If we can't get "*" imports working, and we're left having to invoke as
cached_data_reader.CachedDataReader, that's going to break every yaml
config file in the universe and make them all that much harder to read...
I'd rather live with having the list of import statements in the
not-obvious listen.py.
|
I misunderstood. This seems like a perfectly legit use case for import * to me. I'd just do it. :-) |
Will do! Running tests now...
…On Tue, May 21, 2024 at 5:19 PM Michael D Labriola ***@***.***> wrote:
It works fine with "from logger.readers import *" - it's just that
importing "*" is considered poor form.
I misunderstood. This seems like a perfectly legit use case for import *
to me. I'd just do it. :-)
|
Ahoy @davidpablocohn!
I just realized I cannot use a StripTransform in v1.9.0. It looks like it's never imported by listen.py. A quick audit revealed the following lines appear to be missing from the top of listen.py... Any reason any of these were excluded (e.g., they don't actually work)?
I've imported all of them on my server and things seem to work, although I've honestly only tested StripTransform.