Thanks @1beb for the feature request. I'm planning to add an fst.rbind method to the next version of the fst package. This method will only need to read some metadata from the existing file, so appending will be very fast, as you requested.

Note, however, that fst uses a columnar binary file format, so appended data will be stored as a separate chunk inside the fst file. This has only a marginal impact on performance when large chunks of data are appended. When many small chunks are added sequentially, however, overall performance will suffer.

A partial solution to this problem might be to define an fst.stream class (issue #15) that appends data to an existing file through an internal buffer. When the number of chunks is known, you could also use an fst.lapply method (issue #18, also to be developed) to create a large on-disk data set from many smaller inputs. This could also be done in parallel with an fst.parlapply method.
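To make the chunking trade-off concrete, here is a minimal sketch of a toy chunked columnar file, written in Python. It is not the fst file format or API: the layout, `append_chunk`, `read_column`, and `BufferedAppender` are all hypothetical names invented for illustration. It only shows why an append needs no read of existing data, why reading a column costs one visit per chunk, and how an internal buffer could merge many small appends into fewer chunks.

```python
import json
import os
import struct
import tempfile


def _nrows(columns):
    """Row count of a dict of equal-length columns (0 if there are none)."""
    return len(next(iter(columns.values()), []))


def append_chunk(path, columns):
    """Append one chunk of int64 columns; existing data is never read."""
    n = _nrows(columns)
    blocks = {name: struct.pack(f"<{n}q", *vals) for name, vals in columns.items()}
    sizes = {name: len(block) for name, block in blocks.items()}
    header = json.dumps({"nrows": n, "sizes": sizes}).encode()
    with open(path, "ab") as f:
        f.write(struct.pack("<I", len(header)))  # length prefix for the header
        f.write(header)
        for block in blocks.values():  # one contiguous block per column
            f.write(block)


def read_column(path, name):
    """Return (values, chunks_visited) for one column across all chunks."""
    values, chunks_visited = [], 0
    with open(path, "rb") as f:
        while True:
            prefix = f.read(4)
            if not prefix:
                break
            (header_len,) = struct.unpack("<I", prefix)
            header = json.loads(f.read(header_len))
            for col, size in header["sizes"].items():
                if col == name:
                    values.extend(struct.unpack(f"<{header['nrows']}q", f.read(size)))
                else:
                    f.seek(size, os.SEEK_CUR)  # skip columns we do not need
            chunks_visited += 1
    return values, chunks_visited


class BufferedAppender:
    """Hypothetical buffer: collects small appends, writes them as one chunk."""

    def __init__(self, path, flush_rows):
        self.path, self.flush_rows, self.buffer = path, flush_rows, {}

    def append(self, columns):
        for name, vals in columns.items():
            self.buffer.setdefault(name, []).extend(vals)
        if _nrows(self.buffer) >= self.flush_rows:
            self.flush()

    def flush(self):
        if _nrows(self.buffer) > 0:
            append_chunk(self.path, self.buffer)
        self.buffer = {}


def demo(n=1000, flush_rows=100):
    """Write the same n rows three ways and read column 'x' back from each."""
    with tempfile.TemporaryDirectory() as d:
        one, many, buffered = (os.path.join(d, f) for f in ("one", "many", "buf"))
        append_chunk(one, {"x": list(range(n)), "y": list(range(n))})
        writer = BufferedAppender(buffered, flush_rows)
        for i in range(n):
            append_chunk(many, {"x": [i], "y": [i]})  # one tiny chunk per row
            writer.append({"x": [i], "y": [i]})
        writer.flush()
        return read_column(one, "x"), read_column(many, "x"), read_column(buffered, "x")
```

In this toy layout all three files hold the same data, but the reader visits one chunk per append call that reached the disk, so the row-by-row file needs far more header parses and seeks than the single-chunk or buffered files. How large that cost is in fst itself depends on its actual on-disk layout, which this sketch does not model.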
Is it possible to append to an fst without having to load it (completely)?