`.service` subpkg #483
Conversation
piker/data/cli.py
Outdated
    if delete:
        for fqsn in symbols:
            syms = await storage.client.list_symbols()
            breakpoint()
left in a breakpoint unintentionally?
ish, was WIP, fixed now (ish) via 6a0ae58
@@ -322,37 +354,94 @@ async def open_ahabd(
        ) = ep_func(client)
        cntr = Container(dcntr)

        with trio.move_on_after(start_timeout):
            found = await cntr.process_logs_until(start_lambda)
        conf: ChainMap[str, Any] = ChainMap(
Just curious, why use a ChainMap here instead of a dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it's easier than writing a lot of `dict.get(blah, default)` stuff, and it makes everything extensible if we ever need to offer overrides of container configs for specific use cases..
if you haven't read it yet, check out the docs on this type btw, this is kinda why it was added to stdlib afaiu 🏄🏼
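For illustration, a minimal stdlib-only sketch of the `ChainMap` lookup behavior being discussed; the key names mirror the supervisor startup config shown in this PR's description, but nothing here is piker API:

```python
from collections import ChainMap

# defaults mirroring the supervisor startup config from this PR
defaults = {
    'startup_timeout': 1.0,
    'startup_query_period': 0.001,
    'log_msg_key': 'msg',
}

# a per-container override: only the keys you care about
overrides = {'startup_timeout': 120.0}

# lookups hit `overrides` first, then fall back to `defaults`,
# with no `dict.get(key, default)` boilerplate at each use site
conf = ChainMap(overrides, defaults)

assert conf['startup_timeout'] == 120.0  # override wins
assert conf['log_msg_key'] == 'msg'      # falls through to defaults
```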
@@ -135,7 +135,7 @@ def start_marketstore(

    # create dirs when dne
    if not os.path.isdir(config._config_dir):
-        Path(config._config_dir).mkdir(parents=True, exist_ok=True)
+        Path(config._config_dir).mkdir(parents=True, exist_ok=True)
ws?
@@ -659,7 +662,7 @@ async def tsdb_history_update(
    # - https://github.com/pikers/piker/issues/98
    #
    profiler = Profiler(
-        disabled=False,  # not pg_profile_enabled(),
+        disabled=True,  # not pg_profile_enabled(),
is default true on the Profiler purposeful?
lgtm once ci is passing bruddr
🙏🏼 to da CI gawdz
nothing like hanging for 40 mins
Adds a `piker storage` subcmd with a `-d` flag to wipe a particular fqsn's time series (both 1s and 60s). Obviously this needs to be extended much more, but it provides a starting point.
With the addition of new `elastixsearch` docker support in #464, adjustments were made to container startup sync logic (particularly the `trio` checkpoint sleep period, which is itself a hack around a sync client api) which caused a regression in upstream startup logic wherein container error logs were not being bubbled up correctly, causing a silent failure mode:
- `marketstore` container started with corrupt input config
- `ahabd` super code timed out on the startup phase due to a larger log polling period, skipped processing startup logs from the container, and continued on as though the container was started
- history client fails on the grpc connection with no clear error on why the connection failed.

Here we revert to the old poll period (1ms) to avoid any more silent failures and further extend supervisor control through a configuration override mechanism. To address the underlying design issue, this patch adds support for container-endpoint-callbacks to override supervisor startup configuration parameters via the 2nd value in their returned tuple: the already delivered configuration `dict` value. The currently exposed values include:

    {
        'startup_timeout': 1.0,
        'startup_query_period': 0.001,
        'log_msg_key': 'msg',
    }

This allows for container-specific control over the startup-sync query period (the hack mentioned above) as well as the expected log msg key and, of course, the startup timeout.
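A hedged sketch of what such an endpoint-callback override could look like; `start_elastic_ep` and the container stand-in are hypothetical illustration names (not piker API), only the three config keys come from the commit message above:

```python
from collections import ChainMap
from typing import Any

# supervisor defaults, per the commit message above
SUPERVISOR_DEFAULTS: dict[str, Any] = {
    'startup_timeout': 1.0,
    'startup_query_period': 0.001,
    'log_msg_key': 'msg',
}

def start_elastic_ep(client: Any) -> tuple[Any, dict[str, Any]]:
    # hypothetical container-endpoint-callback: the 2nd tuple value
    # overrides supervisor startup params for this container only
    dcntr = object()  # stand-in for the spawned docker container handle
    return dcntr, {
        'startup_timeout': 240.0,     # this service boots slowly
        'startup_query_period': 0.1,  # poll its logs less aggressively
    }

dcntr, ep_overrides = start_elastic_ep(client=None)
conf = ChainMap(ep_overrides, SUPERVISOR_DEFAULTS)

assert conf['startup_timeout'] == 240.0  # container-specific value
assert conf['log_msg_key'] == 'msg'      # untouched supervisor default
```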
Previously we would make the `ahabd` supervisor-actor sync to docker container startup using pseudo-blocking log message processing. This has issues:
- we're forced to do a hacky "yield back to `trio`" in order to be "fake async" when reading the log stream, and further,
- blocking on a message is fragile and often slow.

Instead, run the log processor in a background task and, in the parent task, poll for the container to be in the client list using a similar pseudo-async poll pattern. This allows the super to `Context.started()` sooner (when the container is actually registered as "up") and thus unblock its (remote) caller faster, whilst still doing full log msg proxying!

Deatz:
- adds `Container.cuid: str`, a unique container id for logging.
- correctly proxy through the `loglevel: str` from the `pikerd` caller task.
- shield around `Container.cancel()` in the teardown block and use cancel-level logging in that method.
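The pattern above can be sketched with stdlib `asyncio` standing in for `trio` (piker actually uses `trio`/`tractor`; every name here is an illustrative stand-in, not piker API): a background log-proxy task plus a parent task that polls for container registration instead of blocking on one particular log message.

```python
import asyncio

class FakeContainer:
    """Hypothetical stand-in for the docker container handle."""
    def __init__(self, cuid: str) -> None:
        self.cuid = cuid          # unique container id, used for logging
        self.logs: list[str] = []
        self.registered = False   # i.e. shows up in the client's container list

async def proxy_logs(cntr: FakeContainer, stop: asyncio.Event, out: list[str]) -> None:
    # background task: keep forwarding log lines; startup never blocks on this
    while True:
        while cntr.logs:
            out.append(f'[{cntr.cuid}] {cntr.logs.pop(0)}')
        if stop.is_set():
            return
        await asyncio.sleep(0.001)  # pseudo-async poll period

async def supervise(cntr: FakeContainer) -> list[str]:
    out: list[str] = []
    stop = asyncio.Event()
    log_task = asyncio.create_task(proxy_logs(cntr, stop, out))
    # parent task: pseudo-async poll for the container to actually be "up"
    while not cntr.registered:
        await asyncio.sleep(0.001)
    # <- the real supervisor would call `Context.started()` here, unblocking
    #    its remote caller while log proxying continues in the background
    stop.set()
    await log_task
    return out

async def boot(cntr: FakeContainer) -> None:
    # simulate the container emitting a log line, then registering as "up"
    cntr.logs.append('starting grpc server...')
    await asyncio.sleep(0.01)
    cntr.registered = True

async def main() -> list[str]:
    cntr = FakeContainer('marketstored-01')
    out, _ = await asyncio.gather(supervise(cntr), boot(cntr))
    return out

log_lines = asyncio.run(main())
assert log_lines == ['[marketstored-01] starting grpc server...']
```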
For now just moves everything that was in `piker._daemon` to a subpkg module but a reorg is coming pronto!
Not really sure there's much we can do besides dump grpc stuff when we detect an "error" `str` for the moment.. Either way, leave a buncha complaints (as always) and do linting fixups..
Thanks @esme! XD Also, do a linter pass and remove a buncha unused references.
Due to making ahabd supervisor init more async we need to be more tolerant to mkts server startup: the grpc machinery needs to be up, otherwise a client which connects too early may just hang on requests.. Add a reconnect loop (which might end up getting factored into client code too) so that we only block on requests once we know the client connection is actually responsive.
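The reconnect-loop idea can be sketched generically; `connect` and `ping` below are hypothetical callables (the real client is grpc-based), so this only shows the shape of "don't block on requests until a cheap real request has succeeded":

```python
import time

def wait_for_responsive(connect, ping, timeout: float = 30.0, period: float = 0.5):
    # keep (re)connecting until a cheap real request actually succeeds;
    # an accepted TCP conn alone is NOT proof the grpc machinery is up
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            client = connect()
            ping(client)  # liveness probe: only now block on a request
            return client
        except ConnectionError:
            time.sleep(period)  # server still booting, retry shortly
    raise TimeoutError('server never became responsive')

# demo: a fake server that refuses the first two connection attempts
attempts = {'n': 0}

def connect():
    attempts['n'] += 1
    if attempts['n'] < 3:
        raise ConnectionError('grpc not up yet')
    return 'client-handle'

client = wait_for_responsive(connect, ping=lambda c: None, timeout=5.0, period=0.01)
assert client == 'client-handle' and attempts['n'] == 3
```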
Provides a more correct solution (particularly for distributed testing) to override the `piker` configuration directory by reading the path from a specific `tractor._state._runtime_vars` entry that can be provided by the test harness. Also fix some typing and comments.
Needed to move the startup sequence inside the `try:` block to guarantee we always do the (now shielded) `.cancel()` call if we get a cancel during startup. Also, support an optional `started_afunc` field in the config if backends want to just provide a one-off blocking async func to sync container startup. Add a `drop_root_perms: bool` to allow persisting sudo perms for testing or dynamic container spawning purposes.
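piker's version uses `trio` shielding; as an illustrative stdlib analogue, here is the "startup inside the `try:` block plus shielded teardown" shape with `asyncio` (all names hypothetical, not piker API):

```python
import asyncio

async def do_teardown(log: list[str]) -> None:
    # analogue of the shielded `Container.cancel()` teardown
    await asyncio.sleep(0.01)
    log.append('container stopped')

async def supervise(log: list[str]) -> None:
    # startup lives *inside* try: so the teardown in finally: always
    # runs, even if we're cancelled while still starting up
    try:
        await asyncio.sleep(0.05)   # stand-in for slow container startup
        await asyncio.sleep(3600)   # "serve" until cancelled
    finally:
        # shield the teardown so a cancellation can't interrupt it
        await asyncio.shield(do_teardown(log))

async def main() -> list[str]:
    log: list[str] = []
    task = asyncio.create_task(supervise(log))
    await asyncio.sleep(0.01)  # cancel *during* startup
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log

teardown_log = asyncio.run(main())
assert teardown_log == ['container stopped']  # teardown ran despite cancel
```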
THERE. moved all outstanding bullets to new follow up issues 🏄🏼 So whoever decides to review this once more and decides we gud, please merge.
Heh, i can tell y'all aren't testing non-disti mode.. 😂
This broke non-disti-mode actor tree spawn / runtime, seemingly because the cli entrypoint for a `piker chart` also sends these values down through the call stack independently? Pretty sure we don't need to send the `enable_modules` from the chart actor anyway.
Move all actor-service control APIs and the service manager (basically all the stuff that was in `piker._daemon`) into a new `piker.service` sub package; this will serve (get it..) as the high-level actor-service orchestration sub-system, the business logic for `piker`'s distributed architecture and runtime deployment.

Of note,
- `.service._ahab` changes which repair a regression: a silent startup failure with `marketstore` due to the log processing poll sleep period change in the `ahabd` supervisor task
- `pikerd` logging correctly to the docker super code in general
- move the `._ahab.py` docker supervisor into the sub-package as well, since it is literally an external, containerized service supervisor 😂
- move the `.data.elastix` and `.marketstore` mods here as well, until we figure out how to formalize our storage layer subsystems, and since most of that code is related to db interaction/management vs. data processing.

ToDo:
- extend the tests for both dbs to have a client connect and do something or other..
- add cancellation tests for the dockerized services => moved these to a follow up testing task: `ahabd` supervision tests #487
- a `._daemon.py` module in each sub-system package? eg. `.service._daemon` contains the routines for spawning `pikerd`, `.data._daemon` contains the routines for spawning the `brokerd` actor? => moved to "How to expose/organize `<daemon>d` entrypoints?" #488
- move the `Services` api into its own module? `_runtime.py`? (** chose `.service._actor_runtime.py` instead **)
- move the `._exec.py` stuff (which is more or less just Qt runtime startup) into this pkg as well? => also moved to "How to expose/organize `<daemon>d` entrypoints?" #488
- a `tmp_dir` fixture for our config dir path during testing via `tractor._state._runtime_vars`