New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
load all datasets into memory #190
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained
manager/manager.go, line 217 at r1 (raw file):
// FetchAnnotator returns the annoator in memory based on filename. func (am *AnnotatorMap) FetchAnnotator(key string) (api.Annotator, error) {
Please just modify GetAnnotator instead of introducing a new function.
Pull Request Test Coverage Report for Build 1836
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
geoloader/filename.go, line 143 at r2 (raw file):
if fileDate.Before(GeoLite2StartDate) { // temporary hack to avoid legacy if !GeoLegacyRegex.MatchString(file.Name) && !GeoLegacyv6Regex.MatchString(file.Name) {
Not clear why this is necessary. Add comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
geoloader/filename.go, line 143 at r2 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
Not clear why this is necessary. Add comment?
Done.
manager/manager.go, line 217 at r1 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
Please just modify GetAnnotator instead of introducing a new function.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
manager/manager.go, line 227 at r3 (raw file):
// GetAnnotator returns the correct annotator to use for a given timestamp. func GetAnnotator(request *api.RequestData) (api.Annotator, error) {
I think this is a regression. I changed all the APIs to use the date, instead of a request. Why change it back?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
geoloader/filename.go, line 35 at r3 (raw file):
// To speed up online matching, we build another directory for IPv6 as well. var datasetDirV6 = &directory{} var datasetDirLockV6 sync.RWMutex
Might be better to use a single lock to protect both, and update them together.
geoloader/filename.go, line 65 at r3 (raw file):
type dateEntry struct { date time.Time filenames []string
I guess you only need a single filename now.
I'm also thinking it would be better to store the GeoIP directly here as well. That is a bigger refactoring, so maybe better to do it in a separate PR.
geoloader/filename.go, line 136 at r3 (raw file):
func UpdateArchivedFilenames() error { old := getDirectoryV4() size := len(old.dates) + 2
Should use the max of v4 and v6 here, or use v4 to size v4, and v6 to size v6.
geoloader/filename.go, line 170 at r3 (raw file):
} if !fileDate.Before(GeoLite2StartDate) && !GeoLite2Regex.MatchString(file.Name) {
This sequence of if statements is becoming hard to reason about.
geoloader/filename.go, line 183 at r3 (raw file):
dirV6.Insert(fileDate, file.Name) } DatasetFilenames = append(DatasetFilenames, file.Name)
This will do the wrong thing when Update... is called again. Put some more thought into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
geoloader/filename.go, line 35 at r3 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
Might be better to use a single lock to protect both, and update them together.
Done.
geoloader/filename.go, line 65 at r3 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
I guess you only need a single filename now.
I'm also thinking it would be better to store the GeoIP directly here as well. That is a bigger refactoring, so maybe better to do it in a separate PR.
Done.
geoloader/filename.go, line 136 at r3 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
Should use the max of v4 and v6 here, or use v4 to size v4, and v6 to size v6.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
geoloader/filename.go, line 170 at r3 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
This sequence of if statements is becoming hard to reason about.
This is to discard all dataset that is after cutoff date, but not Geolite2, since we download legacy datasets as well after cutoff date.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
geoloader/filename.go, line 183 at r3 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
This will do the wrong thing when Update... is called again. Put some more thought into this.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
manager/manager.go, line 227 at r3 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
I think this is a regression. I changed all the APIs to use the date, instead of a request. Why change it back?
because we need info of IPv4 or IPv6 to select an annotator now unfortunately :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
geoloader/filename.go, line 135 at r4 (raw file):
return err } DatasetFilenames = []string{}
I think you need to protect DatasetFilenames with the mutex, too.
Suggest you make it local, and access it with an exported access function.
Probably should then create all the slices, then take the lock and replace them all at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
geoloader/filename.go, line 135 at r4 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
I think you need to protect DatasetFilenames with the mutex, too.
Suggest you make it local, and access it with an exported access function.Probably should then create all the slices, then take the lock and replace them all at once.
BTW, it would be nice to have a unit test that shows this race condition. Let me know if you want help creating one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 1 LGTMs obtained
geoloader/filename.go, line 135 at r4 (raw file):
Previously, gfr10598 (Gregory Russell) wrote…
BTW, it would be nice to have a unit test that shows this race condition. Let me know if you want help creating one.
Done.
This change is