Add a note about what is serialised to file #3075

Merged
benjeffery merged 1 commit into tskit-dev:main from hyanwong:data-saving
May 12, 2025

@hyanwong
Member

This is information that I had to ask about (here). I'm still not 100% sure what is saved into a tree sequence file (e.g. not just the indexes, but various cached properties too, I suspect), so at the moment this doc change says:

When serializing (e.g. storing a {class}`TreeSequence` to disk), the underlying tables
are stored along with the indexes and other stuff. When the tree sequence is loaded
from file, it is then guaranteed to be valid, with pre-calculated indexes and cached
properties (which ones?) immediately available.

If a {class}`TableCollection` is saved to file, then any indexes are also stored in the
file. A {class}`TableCollection` that has been loaded from a file is not, however,
guaranteed to be a valid tree sequence.

Once I have information to fill in the "and other stuff" and "(which ones?)" bits, I'll take this off draft PR mode.

@codecov

codecov Bot commented Dec 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.58%. Comparing base (6542fc2) to head (c5e8020).
Report is 1 commits behind head on main.

Additional details and impacted files
```diff
@@           Coverage Diff           @@
##             main    #3075   +/-   ##
=======================================
  Coverage   89.58%   89.58%           
=======================================
  Files          28       28           
  Lines       31885    31885           
  Branches     5855     5855           
=======================================
  Hits        28565    28565           
  Misses       1888     1888           
  Partials     1432     1432           
```
| Flag | Coverage Δ |
| --- | --- |
| c-tests | 86.66% <ø> (ø) |
| lwt-tests | 80.38% <ø> (ø) |
| python-c-tests | 88.18% <ø> (ø) |
| python-tests | 98.79% <ø> (ø) |

@jeromekelleher
Member

jeromekelleher commented Dec 16, 2024

The other stuff is pretty minimal:

```
$ msp ancestry 10 -o tmp.trees
$ kastore ls tmp.trees
edges/child
edges/left
edges/metadata
edges/metadata_offset
edges/metadata_schema
edges/parent
edges/right
format/name
format/version
indexes/edge_insertion_order
indexes/edge_removal_order
individuals/flags
individuals/location
individuals/location_offset
individuals/metadata
individuals/metadata_offset
individuals/metadata_schema
individuals/parents
individuals/parents_offset
metadata
metadata_schema
migrations/dest
migrations/left
migrations/metadata
migrations/metadata_offset
migrations/metadata_schema
migrations/node
migrations/right
migrations/source
migrations/time
mutations/derived_state
mutations/derived_state_offset
mutations/metadata
mutations/metadata_offset
mutations/metadata_schema
mutations/node
mutations/parent
mutations/site
mutations/time
nodes/flags
nodes/individual
nodes/metadata
nodes/metadata_offset
nodes/metadata_schema
nodes/population
nodes/time
populations/metadata
populations/metadata_offset
populations/metadata_schema
provenances/record
provenances/record_offset
provenances/timestamp
provenances/timestamp_offset
sequence_length
sites/ancestral_state
sites/ancestral_state_offset
sites/metadata
sites/metadata_offset
sites/metadata_schema
sites/position
time_units
uuid
```

So, tables, top-level metadata, indexes and a handful of properties: sequence_length, format info, time_units and uuid (which we don't really use).
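
To make that grouping concrete, here is a minimal sketch that buckets a hand-copied subset of the keys from the listing by the prefix before the first `/`. It uses only the standard library and does not read a real file. The helper name `group_keys` and the `"top-level"` label are illustrative assumptions, not part of tskit or kastore.

```python
from collections import defaultdict

# A hand-copied subset of the keys printed by `kastore ls` (not the full list).
keys = [
    "edges/child", "edges/left",
    "format/name", "format/version",
    "indexes/edge_insertion_order", "indexes/edge_removal_order",
    "metadata", "metadata_schema",
    "nodes/flags", "nodes/time",
    "sequence_length", "time_units", "uuid",
]


def group_keys(keys):
    """Group keys by the text before the first '/'.

    Keys without a '/' are collected under the hypothetical label 'top-level'.
    """
    groups = defaultdict(list)
    for key in keys:
        prefix, sep, _ = key.partition("/")
        groups[prefix if sep else "top-level"].append(key)
    return dict(groups)


groups = group_keys(keys)
for name in sorted(groups):
    print(name, groups[name])
```

With the subset above, the table columns land under `edges` and `nodes`, the two index arrays under `indexes`, and the remaining standalone keys under `top-level`.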

@hyanwong hyanwong marked this pull request as ready for review January 14, 2025 16:34
@benjeffery
Member

@Mergifyio rebase

@mergify
Contributor

mergify Bot commented Jan 15, 2025

rebase

☑️ Nothing to do

Details
  • any of:
    • #commits > 1 [📌 rebase requirement]
    • #commits-behind > 0 [📌 rebase requirement]
    • -linear-history [📌 rebase requirement]
  • -closed [📌 rebase requirement]
  • -conflict [📌 rebase requirement]
  • queue-position = -1 [📌 rebase requirement]

@benjeffery
Member

@Mergifyio rebase

@mergify
Contributor

mergify Bot commented Jan 16, 2025

rebase

✅ Branch has been successfully rebased

@benjeffery benjeffery enabled auto-merge May 12, 2025 14:44
@benjeffery benjeffery added this pull request to the merge queue May 12, 2025
Merged via the queue into tskit-dev:main with commit f255bf5 May 12, 2025
19 checks passed

3 participants