index geo payload #366
WDYT regarding splitting of
@generall I agree with your sentiment, I will split the geohash functions from the geo indexing 👍
    values_count
} else {
    // between 0 and 1 (there is at least one payload per point)
    let payload_per_point = total_points_count as f64 / self.payload_count as f64;
It should be called point_per_payload, shouldn't it?
} else {
    // between 0 and 1 (there is at least one payload per point)
    let payload_per_point = total_points_count as f64 / self.payload_count as f64;
    values_count.saturating_sub((values_count as f64 * payload_per_point) as usize)
That could be used as an approximation of the expected value, but not as min.
primary_clauses: vec![],
min,
exp: expected_count,
max: expected_count,
But max != expected in this case. For max, we should guarantee that it is impossible to match more than this number of points with this query.
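To make the distinction between min, exp and max concrete, here is a minimal sketch of one way to keep the three values consistent. It is not the code from this PR: the function name `estimate`, the parameter `total_values`, and the uniform-spread assumption behind `exp` are all hypothetical. It assumes every indexed point carries at least one value.

```rust
/// Hypothetical estimation result; the field names mirror the snippet under review.
#[derive(Debug, PartialEq)]
struct CardinalityEstimation {
    min: usize,
    exp: usize,
    max: usize,
}

/// Sketch only. `values_count` is the number of indexed values matched by the
/// query, `points_count` the number of indexed points, and `total_values` the
/// number of values stored across all points (assumed >= points_count).
fn estimate(values_count: usize, points_count: usize, total_values: usize) -> CardinalityEstimation {
    if values_count == 0 || points_count == 0 {
        return CardinalityEstimation { min: 0, exp: 0, max: 0 };
    }
    // Hard upper bound: every matched value belongs to exactly one point,
    // and there are only `points_count` points in total.
    let max = values_count.min(points_count);
    // Values beyond the first one on each point. In the worst case all of them
    // sit on already-matched points, so they may not contribute new points.
    let excess = total_values.saturating_sub(points_count);
    // Hard lower bound: at least one point matches, and at most `excess`
    // matched values can be absorbed by points that are already counted.
    let min = values_count.saturating_sub(excess).max(1).min(max);
    // Expected value under the assumption that values are spread uniformly.
    let points_per_value = points_count as f64 / total_values.max(1) as f64;
    let exp = ((values_count as f64 * points_per_value).round() as usize).clamp(min, max);
    CardinalityEstimation { min, exp, max }
}
```

With these definitions `min <= exp <= max` always holds, and `max` is a bound that no query result can exceed, which is the guarantee asked for above.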
    values_count.saturating_sub((values_count as f64 * payload_per_point) as usize)
};
// don't overflow max number of points
let expected_count = total_points_count.min(values_count);
why?
    encode((lon, lat).into(), GEOHASH_MAX_LENGTH)
}

pub fn geo_hash_to_box(geo_hash: &GeoHash) -> GeoBoundingBox {
let rectangle = decode_bbox(geo_hash).unwrap(); | ||
let top_left = GeoPoint { | ||
lat: rectangle.max().x, | ||
lon: rectangle.min().y, | ||
lat: rectangle.max().y, |
mapping x/y to lat/lon is so error prone 😞
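One way to reduce this risk is to do the x/y to lon/lat conversion in exactly one place and use named fields everywhere else. The sketch below is self-contained and hypothetical: `Coord` stands in for a planar coordinate type where x is longitude and y is latitude, `rect_to_box` and the `bottom_right` field are assumed names, and the struct layouts are not taken from the PR.

```rust
/// Stand-in for a planar coordinate: x is horizontal (longitude), y is vertical (latitude).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Coord {
    x: f64,
    y: f64,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct GeoPoint {
    lon: f64,
    lat: f64,
}

/// The only place where x/y is translated into lon/lat.
impl From<Coord> for GeoPoint {
    fn from(c: Coord) -> Self {
        GeoPoint { lon: c.x, lat: c.y }
    }
}

#[derive(Debug, PartialEq)]
struct GeoBoundingBox {
    top_left: GeoPoint,
    bottom_right: GeoPoint,
}

/// Builds a bounding box from the min and max corners of a rectangle.
fn rect_to_box(min: Coord, max: Coord) -> GeoBoundingBox {
    let (min, max) = (GeoPoint::from(min), GeoPoint::from(max));
    GeoBoundingBox {
        // Top-left corner: westernmost longitude, northernmost latitude.
        top_left: GeoPoint { lon: min.lon, lat: max.lat },
        // Bottom-right corner: easternmost longitude, southernmost latitude.
        bottom_right: GeoPoint { lon: max.lon, lat: min.lat },
    }
}
```

After the conversion, no code path needs to remember which of x and y is the latitude.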
if n < 1.0 {
    return 1.0; // By definition
}
(2. * PI * n).sqrt().ln() + n * (n / E).ln()
The std lib also defines TAU, which we could use here:
/// The full circle constant (τ)
///
/// Equal to 2π.
#[stable(feature = "tau_constant", since = "1.47.0")]
pub const TAU: f64 = 6.28318530717958647692528676655900577_f64;
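As an illustration of the suggestion, here is a small sketch of the same Stirling-style expression written with `TAU`. The function name `ln_factorial_stirling` is hypothetical, and the guard for `n < 1.0` from the reviewed snippet is deliberately left out, so the sketch assumes `n >= 1.0`.

```rust
use std::f64::consts::{E, PI, TAU};

/// Stirling's approximation of ln(n!), written with TAU instead of 2π.
/// Sketch only: assumes n >= 1.0.
fn ln_factorial_stirling(n: f64) -> f64 {
    (TAU * n).sqrt().ln() + n * (n / E).ln()
}

/// The same expression in the form used by the reviewed snippet, for comparison.
fn ln_factorial_stirling_pi(n: f64) -> f64 {
    (2. * PI * n).sqrt().ln() + n * (n / E).ln()
}
```

Both forms compute the same quantity; for `n = 10.0` the result is within 0.01 of the exact value of ln(10!).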
@generall thank you very much for pushing this over the finish line!
The PR is good to go from my perspective; only minor comments remain.
I can't officially approve this PR on GitHub as I created it 🙃
This PR adds a new indexing scheme for geo payloads #362
Notes:
The geo_index file has been renamed to geo_hash to make space for the new PersistedGeoMapIndex.