Optimize the memory usage of string vertex map #1151

Merged (1 commit) on Jan 12, 2023

Conversation

sighingnow
Member

What do these changes do?

  • Extend the shared hashmap to be relocatable so that it supports `string_view` as the key
  • Add a `BufferOrEmpty()` method that returns a valid shared pointer even when the blob itself is empty
  • Refactor the vertex map implementation to avoid duplicated code and drop the duplicate in-process hashmap, saving memory
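
To illustrate why a relocatable hashmap suits string keys, here is a minimal Python sketch of the general idea, not vineyard's actual C++ implementation. It assumes that "relocatable" means slots store offsets into one contiguous key buffer instead of pointers, so the buffers can be copied or mapped elsewhere and still be read without a rebuilt in-process copy. The class `OffsetStringMap` and the helper `relocated` are hypothetical names.

```python
import zlib


class OffsetStringMap:
    """Open-addressing map whose slots hold (offset, length, value), never pointers."""

    def __init__(self, capacity=16):
        self.buf = bytearray()          # all key bytes stored back to back
        self.slots = [None] * capacity  # fixed-size table, no resizing in this sketch

    def _find(self, key):
        # crc32 is deterministic across processes, unlike Python's salted hash().
        i = zlib.crc32(key) % len(self.slots)
        for _ in range(len(self.slots)):
            slot = self.slots[i]
            if slot is None or self.buf[slot[0]:slot[0] + slot[1]] == key:
                return i
            i = (i + 1) % len(self.slots)  # linear probing
        return -1  # table is full and the key is absent

    def put(self, key, value):
        i = self._find(key)
        if i < 0:
            raise RuntimeError("map is full")
        if self.slots[i] is None:
            offset = len(self.buf)
            self.buf += key  # append key bytes; the slot records where they live
            self.slots[i] = (offset, len(key), value)
        else:
            offset, length, _ = self.slots[i]
            self.slots[i] = (offset, length, value)

    def get(self, key, default=None):
        i = self._find(key)
        if i < 0 or self.slots[i] is None:
            return default
        return self.slots[i][2]


def relocated(m):
    """Copy the raw buffers into a fresh object, as if mapped at another address."""
    copy = OffsetStringMap(capacity=len(m.slots))
    copy.buf = bytearray(m.buf)
    copy.slots = list(m.slots)
    return copy


m = OffsetStringMap()
m.put(b"alice", 0)
m.put(b"bob", 1)
moved = relocated(m)
```

Because every slot refers to its key by offset, `moved` answers lookups without rehashing or re-inserting anything, which is the property that lets a single shared copy replace a per-process one.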

Related issue number

N/A

@github-actions
Contributor

✅ Doc deploy preview ready: https://deploy-preview-pr-1151--v6d.netlify.app

@sighingnow sighingnow force-pushed the ht/vm-string-opt branch 3 times, most recently from 80cf477 to 5ef7f4e Compare January 12, 2023 13:22
Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
@sighingnow sighingnow merged commit 3b28328 into v6d-io:main Jan 12, 2023
@sighingnow sighingnow deleted the ht/vm-string-opt branch January 12, 2023 13:57
sighingnow added a commit to alibaba/GraphScope that referenced this pull request Jan 14, 2023
## What do these changes do?

- Expose the `retain_oid` option to Python
- Upgrade vineyard to the latest version to handle the memory usage issues
- Add `errors="ignore"` to avoid decoding/encoding errors
- Enable `int32_t` OID type support
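
For context on the `errors="ignore"` item: the commit message does not show where the argument is passed, so the snippet below only demonstrates the standard Python `bytes.decode` / `str.encode` behaviour it relies on. The sample values are made up for illustration.

```python
# 0xff can never appear in valid UTF-8, so strict decoding fails on it.
raw = b"user\xff42"

try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="ignore" the undecodable byte is silently dropped.
lenient = raw.decode("utf-8", errors="ignore")

# The same handler works for encoding: unencodable characters are dropped.
ascii_only = "naïve".encode("ascii", errors="ignore")
```

The trade-off is that bad bytes disappear without any signal, so two distinct inputs can collapse to the same string after decoding.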

## Benchmarking

Settings:
- 1 worker
- loading from CSV (710MB)
  - nodes: 330MB, 21865475
  - edges: 380MB, 11089373
- workload: loading (`generate_eid=False`) -> project -> pagerank

Memory usage:
- `int64_t` oid:
   - `retain_oid=True`:
      - after loading graph, peak: 3.53GB
      - after project, peak: 3.72GB
      - after running the application, peak: 4.49GB
   - `retain_oid=False`:
      - after loading graph, peak: 3.33GB
      - after project, peak: 3.53GB
      - after running the application, peak: 4.29GB
- `string` oid:
   - `retain_oid=True`:
      - after loading graph, peak: 5.16GB
      - after project, peak: 5.18GB
      - after running the application, peak: 6.00GB
   - `retain_oid=False`:
      - after loading graph, peak: 4.68GB
      - after project, peak: 4.70GB
      - after running the application, peak: 5.51GB
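
The figures above can be turned into per-stage savings from `retain_oid=False` with a few lines of arithmetic. This is only a convenience sketch over the numbers reported in this message; the dictionary layout and the `savings` helper are illustrative names.

```python
# Peak memory in GB, copied from the list above: (after load, after project, after app).
peaks_gb = {
    "int64_t": {True: (3.53, 3.72, 4.49), False: (3.33, 3.53, 4.29)},
    "string": {True: (5.16, 5.18, 6.00), False: (4.68, 4.70, 5.51)},
}
stages = ("load", "project", "app")


def savings(oid_type):
    """GB saved at each stage by setting retain_oid=False for the given OID type."""
    kept = peaks_gb[oid_type][True]
    dropped = peaks_gb[oid_type][False]
    return {s: round(k - d, 2) for s, k, d in zip(stages, kept, dropped)}


int64_savings = savings("int64_t")
string_savings = savings("string")
```

Dropping the retained OIDs saves roughly 0.2GB with `int64_t` OIDs and just under 0.5GB with `string` OIDs at every stage.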


## Related issue number

Fixes #2127
Fixes #2269 

See also: v6d-io/v6d#1151

Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>