-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
location fallback #263
location fallback #263
Conversation
Related #262 |
hmm tried to fix the |
seems to be a breaking change with tox4.0 - https://tox.wiki/en/4.0.3/faq.html#tox-4-changed-ini-rules I think I've fixed the passenv issue then, wasnt sure if it was passenv causing the tests to fail before but tests still dont run -- looks like it just installs and then runs no more commands
|
lat: float | ||
lon: float | ||
dt: datetime | ||
accuracy: Optional[float] | ||
elevation: Optional[float] | ||
datasource: Optional[str] = None # which module provided this, useful for debugging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be kinda cool to have it for every data provider.. but haven't thought too much how to make it generic yet. In some cases possible to extract from the full class name, but only if inheritance is involved I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, something like a helper that adds a field to every NT/dataclass which could be used in all.py
, but those tend to be readonly, so would likely need to do some sort of weird non-mypy-compliant magic with a proxy object which likely breaks cachew etc..., so not sure how automatic could be
@dataclass
class ObjProxy:
_obj_proxy: Any
datasource: Optional[str] = None
def getattr(self, name: str) -> Any:
return getattr(self._obj_proxy, name)
... doesnt seem great
For locations in particular I thought it'd be useful incase youre trying to exclude a faulty/wrong location geolocated from ip addresses for example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be kinda cool to have it for every data provider.. but haven't thought too much how to make it generic yet.
Apologies for the poor quality of this reply. I’m not feeling well and as much as I try I can’t understand the code that’s being worked on right now. I’m adding this in case it helps the project but I understand if it doesn’t fit.
When I was putting thought in to building something similar to HPI I was targeting a graph database so that each item would have (Item)->(Metadata) such as (jpg file)->(EXIF size), (jpg file)-(EXIF date), and so-on. That way each piece of metadata could have source (eg: “Josh’s iPhone”) and accuracy properties. In that scenario correcting the record was a matter of adding the new metadata node with a source of “User: Josh” and an accuracy of whatever they decided.
Because this type of correction is non-destructive there was just logic that pulled all sources of metadata from the API then displayed the user-defined source as the primary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The graph database idea makes a lot of sense as a solution to structuring this data, but HPI in general tends to denormalize databases into single dataclasses/namedtuples so that its able to be cached with cachew
and just easier to manage as an Iterator
instead of downstream functions (other functions, hpi query
) having to know the specifics of every API
Still not sure what my plan is on a separate repo/how much to put there, but will see if I can get some basic file-based graphdb working (though may be easier to just use a JSON file/sqlitedb instead?)
seems that after version 4.0 it's necessary to specify environments to run previosly it was picking them up automatically
7cf29c8
to
52b6acb
Compare
ok, worked through a lot of the CI errors, current error seems to be related to sqlalchemy on mac? I remember seeing you switching something to do with dataset lib and I did merge master into this branch (to fix tox.ci conflicts) so I should be up to date on any changes you've made Theres also this broken https://github.com/karlicoss/HPI/actions/runs/4250652886/jobs/7392019565#step:5:1983 |
yeah, I've basically added all reasonable configs stubs there when I did the change to tox file, I think that's the best way forward. The only exception is some modules which I don't feel generic enough and might just move to personal overlay later -- for them just did
Not sure what you mean about -qlalchemy, but I think I've seen this error before a couple of times on macos, it's transient, retried the CI for now. I think it's because the expecimental |
Just for posterity (regarding
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, looks good! Hope I get to work more on my locations stuff soon and try it
logger.debug(f"no. of items being used from fallback locations: {len(use_fallback_days)}") | ||
|
||
# combine local_dates and missing days from fallback into a sorted list | ||
all_dates = heapq.merge(local_dates, use_fallback_days, key=lambda p: p.day) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh nice, didn't know heapq.merge works on iterators!
@@ -41,17 +41,23 @@ class config(user_config): | |||
# if the accuracy for the location is more than 5km, don't use | |||
require_accuracy: float = 5_000 | |||
|
|||
# how often (hours) to refresh the cachew timezone cache | |||
# this may be removed in the future if we opt for dict-based caching |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe we'll have support for cache expiry in cachew -- wouldn't be too hard to add, and kinda useful anyway
👍 will fix the issues and add a test for the sorted ip bisect later today |
doesnt have a 'key' parameter till python3.10
Added a They I think thats everything, will probably do a follow up PR to fix some CI issues/link to my repo as examples for denylist |
otherwise, it seems to use my config from ~/.config/my/my... instead of relying on the defaults in each module
Nice! Happy to merge, but looks like there are still some git conflicts? |
Ah strange GitHub UI didn't give me any indication there was a conflict (though not surprising) Don't have power at my house right now but will do a local merge against master and push the merge commit when I can |
ah sorry! I think we had different idea about conflicts because my default PR merge action is "rebase and merge"! If I choose "Squash and merge", it's good now! |
Ah all good, this has been complex with me merging the master into here a few times, so to be expected Was just looking into squashing the commit myself but if you can just squash it in github ui that would be great |
Same transient |
ah! rushed the merge a bit 😅 either way, since it's transient should be fine |
just basic restructuring/model for now, still a WIP