Big objects in the repo and overall repo size #339

Closed
little-arhat opened this issue Jul 6, 2020 · 8 comments

@little-arhat

Hello!

I'm using nightly and therefore have to use rtic from the repo. I ran cargo update recently and noticed that the rtic-rs update took much longer than the cortex-m one.

I used a script to find large objects in the repo:

All sizes are in kB's. The pack column is the size of the object, compressed, inside the pack file.
size  pack  SHA                                       location
2103  47    08c4a85f54d6b3c9fb278e4151f592015b9313c9  0.5/api/src/typenum/home/runner/work/cortex-m-rtic/cortex-m-rtic/target/debug/build/typenum-59ccc0bf39fe0693/out/consts.rs.html
2103  47    2581f277d9768d66922c776b16413a620befddbf  0.5/api/src/typenum/home/runner/work/cortex-m-rtfm/cortex-m-rtfm/target/debug/build/typenum-59ccc0bf39fe0693/out/consts.rs.html
2103  47    fd2f514d749683816e41897ca084c1fb1db2bc5f  0.5/api/src/typenum/home/runner/work/cortex-m-rtic/cortex-m-rtic/target/debug/build/typenum-a0f5187ff8fa831c/out/consts.rs.html
824   100   d749fa99efbdf30a9eaac2d481c2aa8cb096beef  0.4/api/search-index.js
771   98    f5334325ce89c6a88c3fced2d12f31cec0e45db3  0.5/api/search-index.js
729   32    49235c3dca5c799f6a37c07c9ca2cba2db71c332  0.5/api/src/typenum/home/travis/build/rtfm-rs/cortex-m-rtfm/target/debug/build/typenum-66dd1c384d5e8887/out/consts.rs.html
653   80    9877fd178878f7f402df0e2a7cb595ef878f53de  0.5/book/en/searchindex.json
616   31    3a1619df6a4b4fbfb5188e96cfe73f6c86d09903  0.4/api/src/typenum/tmp/tmp.tH98GYNQFY/target/debug/build/typenum-54cf2c9546c781c7/out/consts.rs.html
503   40    3dd91184ea55520a74466f0a975a12c19bb873f9  0.4/api/src/syn/item.rs.html
498   32    12dcb51fa96cb924459a7e094b758ddf17441479  0.5/api/heapless/consts/index.html

Unfortunately, I couldn't find the originating commit with git log --all --pretty=format:%H | xargs -n1 -I% sh -c "git ls-tree % | grep -q 08c4a85f54d6b3c9fb278e4151f592015b9313c9 && echo %".
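
A possible reason the search above finds nothing: git ls-tree without -r lists only the top-level entries of each commit's tree, while the blob in question sits many directories deep. Below is a minimal sketch of a recursive variant. The function name find_commits_with_blob is hypothetical, and since it inspects every commit it can be slow on a large history.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): print every commit,
# reachable from any ref, whose tree contains the given blob at any depth.
find_commits_with_blob() {
    blob="$1"
    git rev-list --all | while read -r commit; do
        # -r recurses into subtrees, so nested paths are searched as well.
        if git ls-tree -r "$commit" | grep -q "$blob"; then
            echo "$commit"
        fi
    done
}

# Usage (run inside the clone):
#   find_commits_with_blob 08c4a85f54d6b3c9fb278e4151f592015b9313c9
```

Newer Git versions also accept git log --all --find-object=<sha>, which reports only the commits that add or remove the object rather than every commit containing it.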

@korken89
Collaborator

Hi,

You seem to be cloning the docs branch as well (which is huge).
I'd recommend modifying your clone command so that it doesn't pull all branches.

@AfoHT
Contributor

AfoHT commented Jul 21, 2020

Hi!

I've been looking into this a bit as well, and it seems we have huge things in the repo history. As far as I can tell, a regular clone (no options) only fetches the master branch by default; for comparison, see the second command with the branch specified:

❯ git clone git@github.com:rtic-rs/cortex-m-rtic.git
Cloning into 'cortex-m-rtic'...
remote: Enumerating objects: 51250, done.
remote: Counting objects: 100% (51250/51250), done.
remote: Compressing objects: 100% (3615/3615), done.
remote: Total 211241 (delta 50729), reused 47935 (delta 47585), pack-reused 159991
Receiving objects: 100% (211241/211241), 37.70 MiB | 4.57 MiB/s, done.
Resolving deltas: 100% (207295/207295), done.
❯ git clone -b master git@github.com:rtic-rs/cortex-m-rtic.git cortex-m-rtic-master
Cloning into 'cortex-m-rtic-master'...
remote: Enumerating objects: 51250, done.
remote: Counting objects: 100% (51250/51250), done.
remote: Compressing objects: 100% (3615/3615), done.
remote: Total 211241 (delta 50729), reused 47935 (delta 47585), pack-reused 159991
Receiving objects: 100% (211241/211241), 37.70 MiB | 4.37 MiB/s, done.
Resolving deltas: 100% (207295/207295), done.
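
The two transfers above are the same size (211241 objects, 37.70 MiB). That matches Git's documented behavior: -b only selects which branch is checked out, and a clone still fetches every branch unless --single-branch is given. Below is a minimal sketch of a clone restricted to one branch; the helper name clone_single_branch is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: clone only one branch of a repository.
# --single-branch restricts the fetch to the history of the named branch;
# --branch on its own only chooses which branch gets checked out.
clone_single_branch() {
    url="$1"
    branch="$2"
    dest="$3"
    git clone --quiet --single-branch --branch "$branch" "$url" "$dest"
}

# Usage:
#   clone_single_branch git@github.com:rtic-rs/cortex-m-rtic.git master cortex-m-rtic-master
```

This applies to manual clones; how cargo fetches a git dependency is a separate question.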

With the help of git sizer:

❯ git sizer --verbose
Processing blobs: 271747
Processing trees: 3576
Processing commits: 755
Matching commits to trees: 755
Processing annotated tags: 21
Processing references: 135
| Name                         | Value     | Level of concern               |
| ---------------------------- | --------- | ------------------------------ |
| Overall repository size      |           |                                |
| * Commits                    |           |                                |
|   * Count                    |   755     |                                |
|   * Total size               |   275 KiB |                                |
| * Trees                      |           |                                |
|   * Count                    |  3.58 k   |                                |
|   * Total size               |  12.1 MiB |                                |
|   * Total tree entries       |   300 k   |                                |
| * Blobs                      |           |                                |
|   * Count                    |   272 k   |                                |
|   * Total size               |  1.47 GiB |                                |
| * Annotated tags             |           |                                |
|   * Count                    |    21     |                                |
| * References                 |           |                                |
|   * Count                    |   135     |                                |
|                              |           |                                |
| Biggest objects              |           |                                |
| * Commits                    |           |                                |
|   * Maximum size         [1] |  5.37 KiB |                                |
|   * Maximum parents      [2] |     4     |                                |
| * Trees                      |           |                                |
|   * Maximum entries      [3] |  3.30 k   | ***                            |
| * Blobs                      |           |                                |
|   * Maximum size         [4] |  2.06 MiB |                                |
|                              |           |                                |
| History structure            |           |                                |
| * Maximum history depth      |   496     |                                |
| * Maximum tag depth      [5] |     1     |                                |
|                              |           |                                |
| Biggest checkouts            |           |                                |
| * Number of directories  [6] |   520     |                                |
| * Maximum path depth     [7] |    15     | *                              |
| * Maximum path length    [7] |   127 B   | *                              |
| * Number of files        [7] |  30.2 k   |                                |
| * Total size of files    [7] |   185 MiB |                                |
| * Number of symlinks         |     0     |                                |
| * Number of submodules       |     0     |                                |

[1]  e925a3e38f493b835356fb94dc118236ed217708 (refs/remotes/origin/root)
[2]  9a974585d0c3e20e860e5d1ad4cf9df501d229d9
[3]  b335fe457ba3a960746bb5bd00c95d160bc368f1 (refs/remotes/upstream/gh-pages:0.4/api/typenum)
[4]  ae0c70624634faa6581abb505c4436c3b5f82402 (5a331d3e75a697a127670085555ccb1a1722c4f2:0.5/api/src/typenum/home/travis/build/rtfm-rs/cortex-m-rtfm/target/debug/build/typenum-045f2683e9dd584d/out/consts.rs.html)
[5]  03adc6ed6d3ad64d0656f25a19ad31fb3860bb56 (refs/tags/v0.1.0)
[6]  940dd3efa7373f9e6d018d63a305f3f868d8cde8 (5a331d3e75a697a127670085555ccb1a1722c4f2^{tree})
[7]  80e42af62373cc34a262a704f52d4b821a709413 (refs/remotes/upstream/gh-pages^{tree})

The largest object [4]:

git show 5a331d3e75a697a127670085555ccb1a1722c4f2
commit 5a331d3e75a697a127670085555ccb1a1722c4f2
Author: Travis CI User <travis@example.org>
Date:   Fri Nov 15 00:46:04 2019 +0000

    Update documentation
<...>

🤔

❯ git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | awk '/^blob/ {print substr($0,6)}' | sort --numeric-sort --key=2 -r | head -n 20
ae0c70624634faa6581abb505c4436c3b5f82402 2155246 0.5/api/src/typenum/home/travis/build/rtfm-rs/cortex-m-rtfm/target/debug/build/typenum-045f2683e9dd584d/out/consts.rs.html
64e427c1b6ccdf2592405c155c6b8c0d5a80019c 2155045 0.4/api/src/typenum/tmp/tmp.ePvTrX26Tz/target/debug/build/typenum-045f2683e9dd584d/out/consts.rs.html
fd2f514d749683816e41897ca084c1fb1db2bc5f 2154096 0.5/api/src/typenum/home/runner/work/cortex-m-rtic/cortex-m-rtic/target/debug/build/typenum-a0f5187ff8fa831c/out/consts.rs.html
b76b0bf220ecbbf2ce0bed3138b8e069891b202c 2154096 0.5/api/src/typenum/home/runner/work/cortex-m-rtfm/cortex-m-rtfm/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
2581f277d9768d66922c776b16413a620befddbf 2154096 0.5/api/src/typenum/home/runner/work/cortex-m-rtfm/cortex-m-rtfm/target/debug/build/typenum-59ccc0bf39fe0693/out/consts.rs.html
08c4a85f54d6b3c9fb278e4151f592015b9313c9 2154096 0.5/api/src/typenum/home/runner/work/cortex-m-rtic/cortex-m-rtic/target/debug/build/typenum-59ccc0bf39fe0693/out/consts.rs.html
49235c3dca5c799f6a37c07c9ca2cba2db71c332 2154091 0.5/api/src/typenum/home/travis/build/rtfm-rs/cortex-m-rtfm/target/debug/build/typenum-66dd1c384d5e8887/out/consts.rs.html
9572ed1ba17a3c1976934ae859bd39ba283e210d 2154026 0.5/api/src/typenum/home/runner/work/cortex-m-rtic/cortex-m-rtic/target/debug/build/typenum-d39757dc6297ced2/out/consts.rs.html
f898d846875f7568bb89315f544d480a2a239b6c 2153890 0.4/api/src/typenum/tmp/tmp.9YHwJJ9MeT/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
d364ffa3d12a9e75cbfc676caae0ddfbbe100042 2153890 0.4/api/src/typenum/tmp/tmp.jwIkPYgSwD/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
a7fc5f2c9e8f68586bac6d39f6152786cdd4ca41 2153890 0.4/api/src/typenum/tmp/tmp.ei4RSn3ks2/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
95129742e88abfd9dcc4f15d32021d79381de69a 2153890 0.4/api/src/typenum/tmp/tmp.7hwkApZG2b/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
8eb7629137b2ac3e8206932302fd5d341109dd2d 2153890 0.4/api/src/typenum/tmp/tmp.bDQWcG3k93/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
8b4a5ec7fb035f9491b92bb8d69cfc6356354f1b 2153890 0.4/api/src/typenum/tmp/tmp.TRIH1hUSRA/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
815ba8da234668f5a1059f62cb2f14166dfdaa28 2153890 0.4/api/src/typenum/tmp/tmp.rPR6Z9c7xv/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
728a136d908e07a1d102cd53f21ad824b5eef795 2153890 0.4/api/src/typenum/tmp/tmp.NrP13DtGDZ/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
608eed869750338b021b384fb3408f3252246de1 2153890 0.4/api/src/typenum/tmp/tmp.xKHBf2Kw71/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
54e01f24a37e7f7792427c3ffbaa8b33acd5c1db 2153890 0.4/api/src/typenum/tmp/tmp.KglORjM7xc/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html
5334f6e3574856cda9285ff62abdb3552247ae41 2153890 0.4/api/src/typenum/tmp/tmp.IzGxYC2BjF/target/debug/build/typenum-b1b4070d89d9c89e/out/consts.rs.html
48146e4a0f9d6084d7da7479b9a85069576e5ecf 2153890 0.4/api/src/typenum/tmp/tmp.lPbq1RF9hN/target/debug/build/typenum-5f51f6e2dd2ce5fa/out/consts.rs.html

Those tmp folders don't look right.
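
To get a feel for how much those repeated files add up to, the listing can be totalled per path pattern. Below is a minimal sketch, assuming the same "type sha size path" line format produced by the cat-file command above and paths without spaces. The function name sum_blobs_matching is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "type sha size path" lines (the --batch-check
# format used above) on stdin and print the number and total uncompressed
# size of the blobs whose path matches the given regular expression.
# Assumes paths contain no spaces, so the path is exactly field 4.
sum_blobs_matching() {
    awk -v pat="$1" '
        $1 == "blob" && $4 ~ pat { n++; bytes += $3 }
        END { printf "%d blobs, %.0f bytes\n", n, bytes }
    '
}

# Usage (inside the clone):
#   git rev-list --objects --all |
#     git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
#     sum_blobs_matching 'consts[.]rs[.]html$'
```

The sizes are uncompressed object sizes, so the total overstates what is actually transferred in a pack.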

I'm not sure yet where the data is hiding, but I think this needs further investigation. Thanks @little-arhat for bringing it to our attention!

@little-arhat
Author

little-arhat commented Jan 23, 2021

Given that the last released version (0.5.5) is not compatible with recent cortex-m, users who want to stay on cortex-m >= 0.7 have to use the git version, and given the repo size, cargo fetch takes a lot of time. Sad.

(It seems Cargo itself doesn't support shallow clones: rust-lang/cargo#1171)

@korken89
Collaborator

korken89 commented Mar 4, 2021

Issue cleanup: Please reopen if this is still active.

@korken89 korken89 closed this as completed Mar 4, 2021
@little-arhat
Author

little-arhat commented Mar 5, 2021

I would say this is still active, as no progress has been made either in cargo (to support shallow clones) or here (to clean up the history) :)

@AfoHT
Contributor

AfoHT commented Mar 18, 2021

Reopening. I think this is especially relevant for everyone testing out the latest alpha with a git upstream.

Support for shallow clones in cargo is of course outside our reach, leaving the option of somehow trimming the history without breaking too much.

Suggestions are welcome, I'll try to dig deeper into this.
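
One way to trim the history of generated documentation, offered only as an illustration and not necessarily what was eventually done here, is to replace a docs branch with a single parentless commit that carries the same tree. Below is a minimal sketch; the function name squash_branch_history and the branch name in the usage lines are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: replace a local branch with one parentless commit that
# has the same tree as the current branch tip, dropping that branch's history.
squash_branch_history() {
    branch="$1"
    message="$2"
    # commit-tree without -p creates a root commit for the given tree.
    new_commit=$(git commit-tree "refs/heads/$branch^{tree}" -m "$message") || return 1
    git update-ref "refs/heads/$branch" "$new_commit"
}

# Usage (branch name is an assumption):
#   squash_branch_history gh-pages "Squash documentation history"
#   git push --force origin gh-pages
```

The rewritten branch has to be force-pushed, and the old objects only stop being transferred once no other ref still reaches them.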

@AfoHT
Contributor

AfoHT commented Apr 9, 2021

@little-arhat Please test this again after yesterday's cleanup.

For details see the discussion in the RFC repo linked.

@AfoHT
Contributor

AfoHT commented Apr 19, 2021

After a few commits to master, including documentation rebuilds and publishing, I get this much nicer total repo size:

❯ git clone --mirror git@github.com:rtic-rs/cortex-m-rtic.git 
Cloning into bare repository 'cortex-m-rtic.git'...
remote: Enumerating objects: 43862, done.
remote: Counting objects: 100% (32554/32554), done.
remote: Compressing objects: 100% (1274/1274), done.
remote: Total 43862 (delta 31870), reused 31789 (delta 31211), pack-reused 11308
Receiving objects: 100% (43862/43862), 11.05 MiB | 1.31 MiB/s, done.
Resolving deltas: 100% (40018/40018), done.

/tmp took 16s 
❯ 

So I consider this resolved; if not, please reopen.

@AfoHT AfoHT closed this as completed Apr 19, 2021