Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single cache file #145

Merged
merged 4 commits into from
Feb 16, 2021
Merged

Single cache file #145

merged 4 commits into from
Feb 16, 2021

Conversation

belltailjp
Copy link
Member

Currently file caches (FileCache and MultiprocessFileCache) save the cache information separately into index file and data file.
This PR is to propose to change it to single file.

The new cache file is formatted as:

  • index part: (8 bytes sample offset + 8 bytes sample size) * N
  • data part: (arbitrary size of binary data) * N

where N is the number of samples to cache.
This is actually completely equivalent to just concatenating the current index and data file.

Background

More strict cache size limitation (related to #143)

#143 will introduce limitation of cache data, however currently it is limited to data file only.
It is usually OK - as the index files are relatively negligible in most cases (e.g., ImageNet1K case 34GB vs 20MB), but as the number of parallelism grows the index file increases, which becomes relatively non-trivial, but they'll be out of limitation.
By unifying the index and data file into one file, we can constrain the cache size more strictly.

Easier handling

In case we want to fully utilize cache preserve and preload functionality, if the preserved cache is a single file it is easier to handle/manage.

Performance concern

@kuenishi has pointed out that since in a single-file cache, every time we access to the cache it reads the file twice (to get index and actually data), which may increase chance of conflict in multi-process use.
To be studied.

@kuenishi kuenishi self-requested a review February 10, 2021 10:29
@kuenishi
Copy link
Member

/test

@pfn-ci-bot
Copy link

Successfully created a job for commit 9889a9c:

@kuenishi
Copy link
Member

@belltailjp Tests passed. Is this still WIP?

@kuenishi kuenishi added this to the 1.1.0 milestone Feb 15, 2021
@belltailjp
Copy link
Member Author

I wanted to try some micro-benchmarks to see if there's non-negligible performance regressions due to increased chance of file access concurrency in MultiprocessFileCache. 🙇‍♂️

@belltailjp
Copy link
Member Author

https://gist.github.com/belltailjp/eb6a98a799bf1a56bd42132b92db763b#file-result-md
I checked the performance by a simple micro-benchmark.

  • Build a cached file with 1MB/sample * 32K samples (32GB total), then measure loading time by 16, 32, 64 and 128 parallel processes.
  • Same for 32KB/sample * 1M samples

There are certain amount of fluctuations but I think it's OK to say no significant degradation (nor improvement) in performance.

@belltailjp belltailjp changed the title [WIP] Single cache file Single cache file Feb 16, 2021
Copy link
Member

@kuenishi kuenishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good, but please fix the conflict against #143 .

@kuenishi
Copy link
Member

/test

@pfn-ci-bot
Copy link

Successfully created a job for commit bb85698:

@kuenishi kuenishi merged commit 4985ce4 into pfnet:master Feb 16, 2021
@kuenishi kuenishi mentioned this pull request Feb 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants