Some ideas #9

Closed
@safinaskar

Thanks for this project!

Some ideas (note that I have only read the blog post and didn't dig further):

  • It may be a good idea to fully replicate git's CLI, at least as an option. This will help spread the project
  • Migrate away from SHA1. It is broken. It is a very unfortunate design mistake in git. Also, you should change hashes regularly anyway: https://valerieaurora.org/hash.html . (Well, actually migrating away from SHA1 will likely break GitHub compatibility, so, of course, it makes sense to support SHA1 for now. But please support other hashes, too. Don't repeat git's mistake: git originally just hardcoded SHA1 everywhere.)
  • In the past I spent a lot of time researching CDC (content-defined chunking) and deduplication. My findings are in borgbackup/borg#7674 ("casync decompresses x1.5 faster than borg on same config (and other benchmarks)"). A short overview of FOSS solutions is here: https://lobste.rs/s/0itosu/look_at_rapidcdc_quickcdc#c_ygqxsl . In short, existing solutions are under-optimized, and there is a lot of low-hanging fruit here. I was very easily able to write a very small program in Rust that beats existing deduplication solutions by a wide margin (though my program doesn't use CDC). So I suggest reading my ideas and comparing the speed of your solution with other solutions
  • Patch-based merging seems to be the killer feature (assuming it works well). So, I suggest making it the centerpiece of how you advertise the project. Linux devs often maintain their patchsets as a series of patch files, not as git branches, exactly because git merging doesn't work well. So, reach out to Linux devs and tell them about your tool. In particular, the number-two person in Linux, Greg KH, maintainer of the stable Linux trees, stores his stable trees as a series of patch files in git (aaaah!). Here he describes his workflow: http://www.kroah.com/log/blog/2019/08/14/patch-workflow-with-mutt-2019/ . The key parts are these: "The stable kernel tree, while under development, is kept as a series of patches that need to be applied to the previous release. This series of patches is maintained by using a tool called (quilt)... Anyway, the stable patches are kept in a quilt series in a repository that is kept under version control in git (complex, yeah, sorry.) That queue can always be found (here)". The same applies to a lot of Debian packages. For example, gcc (like lots of other Debian packages) is, again, maintained as patches stored in git. See here: https://salsa.debian.org/toolchain-team/gcc/-/tree/gcc-14-debian/debian/patches . I think this is, again, because of problems with git merge and git rebase. So, promote xit as a tool that solves all these problems. Of course, it helps if you are CLI-compatible with git
  • "If the first byte is 0, it is uncompressed; if it is 1, it is zlib-compressed". I suggest moving to zstd, which in my experience is better in every way (faster, with smaller output). Also, zstd may be good at compressing binary files (at least I hope zstd doesn't make them significantly larger). "While xit has compression support, it currently disables it even for text files". Try zstd -0, it is fast enough, while giving substantial compression for text files. If it is too slow, try lz4, it is even faster
  • "Want to find the descendent(s) of a commit? Uhhh...well, you can't". As pointed out on lobsters, you can see descendants: https://lobste.rs/s/mltpfg/xit_is_coming#c_cnwsps . (But I understand your point, i.e. you argue that we need a separate data structure for this)
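
To make the compression-flag point concrete, here is a minimal Python sketch of a one-byte tag scheme like the one quoted from the blog post (0 = uncompressed, 1 = zlib). This is not xit's actual code: the function names `encode_chunk` and `decode_chunk` are hypothetical, and the "compress only if it shrinks the data" policy is my assumption. A new tag value for zstd or lz4 could be added the same way.

```python
import zlib

RAW = 0   # first byte 0: payload stored uncompressed (as quoted from the blog post)
ZLIB = 1  # first byte 1: payload is zlib-compressed (as quoted from the blog post)


def encode_chunk(data: bytes) -> bytes:
    """Prefix data with a one-byte tag; compress only when it makes the data smaller.

    The fallback to RAW is an assumed policy, not something the blog post describes.
    """
    packed = zlib.compress(data, 1)  # level 1: fastest zlib setting
    if len(packed) < len(data):
        return bytes([ZLIB]) + packed
    return bytes([RAW]) + data


def decode_chunk(blob: bytes) -> bytes:
    """Read the tag byte and return the original payload."""
    if not blob:
        raise ValueError("empty chunk")
    tag, payload = blob[0], blob[1:]
    if tag == RAW:
        return payload
    if tag == ZLIB:
        return zlib.decompress(payload)
    raise ValueError(f"unknown compression tag: {tag}")
```

With a scheme like this, incompressible binary data never grows by more than the one tag byte, which addresses the worry about binary files above.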

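On the descendants point, the "separate data structure" can be as simple as a reverse index from each commit to its children. The Python sketch below is only an illustration under my own assumptions: the commit IDs are made up, `parents_of` stands in for whatever parent links the repository already stores, and none of this is xit's or git's actual implementation.

```python
from collections import defaultdict, deque


def build_children_index(parents_of):
    """Invert a commit -> parents mapping into a parent -> children mapping."""
    children = defaultdict(set)
    for commit, parents in parents_of.items():
        for parent in parents:
            children[parent].add(commit)
    return children


def descendants(children, start):
    """Return every commit reachable from start by following child links."""
    seen = set()
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        for child in children.get(commit, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical history: b and c both branch from a, and d merges them.
parents_of = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
index = build_children_index(parents_of)
print(sorted(descendants(index, "a")))
```

Without such an index, the same answer requires walking back from every branch tip, which is what the approach mentioned on lobsters amounts to.
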
Feel free to ask any questions.

Also: even if you implement all these, I still do not plan to use xit. (I'm not trying to insult you, I just am trying to be honest here about my motivations.)

Also, there is a discussion of your project here: https://lobste.rs/s/mltpfg/xit_is_coming . If you want, I can give you an invite
