Merged
57 commits
e96d289
First stab at finding a format to help analyse the problem
Byron Nov 19, 2021
f14cd61
Finish technical problems and solutions (#259)
Byron Nov 20, 2021
0c02c13
Analysis about loose refs db (#259)
Byron Nov 20, 2021
3ada0c9
Read up on ref-table a little and fill in some details (#259)
Byron Nov 20, 2021
3bdf6c3
Write about git configuration to assert it doesn't affect the discove…
Byron Nov 20, 2021
a6d73e4
beef up problem/solution list and add git index info for completeness…
Byron Nov 20, 2021
a0bcae0
get started with analysing changes (soon changes in parallel) (#259)
Byron Nov 20, 2021
c4f1db0
Analysis of repository changes in presence of caches and lack of atom…
Byron Nov 20, 2021
6494776
Add CLI example… (#259)
Byron Nov 20, 2021
ac8f594
Add professional git hosting server usecase (#259)
Byron Nov 20, 2021
55773b8
Describe and propose fix for ref namespace-sharing issue (#259)
Byron Nov 21, 2021
ca109c1
A use-case for an intranet scale repository server (without problems/…
Byron Nov 21, 2021
2028f58
Finish zero-conf easy-maintenance server problem statement (#259)
Byron Nov 21, 2021
c042072
First thoughts on how to tackle the pack problem (#259)
Byron Nov 21, 2021
63dd57b
Some more thoughts, it _could_ even work (#259)
Byron Nov 21, 2021
22caa6c
Clarification of thoughts, onto something here (#259)
Byron Nov 21, 2021
2e62eb8
More clarification around Policies, which makes `Repository` alone un…
Byron Nov 21, 2021
b7eaf74
Some more ideas that will modify what `Repository` is fundamentally (…
Byron Nov 21, 2021
dad4bf9
even more notes about implementation details (#259)
Byron Nov 21, 2021
761c63a
add experiment for type system (#259)
Byron Nov 21, 2021
cf17816
A first stab at sketching types (#259)
Byron Nov 22, 2021
882dde9
FAIL: try to make thread-safety togglable in just git-features (#259)
Byron Nov 22, 2021
3ce7f30
Sketch types across git-features, git-odb and git-repository… (#259)
Byron Nov 22, 2021
1e86148
A few more steps towards implementation… (#259)
Byron Nov 22, 2021
e6ae34f
use generic Repository type with typedef default to avoid macro-madne…
Byron Nov 22, 2021
e779311
Now it's getting somewhere, with the potential for resource pooling (…
Byron Nov 22, 2021
4b84a48
All logic for handling eager refreshing… (#259)
Byron Nov 22, 2021
1837abc
change direction towards something simpler that will still be fast (#…
Byron Nov 22, 2021
5e9250f
Turns out the new `PolicyStore` can co-exist with existing one… (#259)
Byron Nov 23, 2021
ab1562e
some more clarity around generations and cache updates (#259)
Byron Nov 23, 2021
3c790e0
remove trait bounds to allow single-threaded applications to exist (#…
Byron Nov 23, 2021
c805d0b
unify trait bounds for parallel code: prefer Clone over Sync (#259)
Byron Nov 23, 2021
4c00f78
Sketch the future of the ref-db with namespace fix and upcoming refta…
Byron Nov 23, 2021
8cafc8d
a path forward to deciding which ODB implementation to pick (#259)
Byron Nov 23, 2021
977bd80
This experiment shows that working through an Arc isn't slower than t…
Byron Nov 23, 2021
d40288a
Show that arcs can cost up to 16.5% when having 10mio ops compared to…
Byron Nov 23, 2021
9696383
RwLocks are preferred over Mutexes if there is a decent read-only pat…
Byron Nov 23, 2021
4160b4a
Adjust unload mode to be a policy wide settings, probably along some …
Byron Nov 23, 2021
3d533a1
Fix single-threaded version of design-sketch (#259)
Byron Nov 23, 2021
f90207d
More work towards understanding indices and multi-pack indices for ev…
Byron Nov 23, 2021
3fce8f2
sketch a little more how packs could be accessed (#259)
Byron Nov 24, 2021
babfb7b
figure out how pack-ids can be stable even though multi-packs are sti…
Byron Nov 24, 2021
fae3e7c
A better idea on how multi-pack indices (and changes to them) work (#…
Byron Nov 24, 2021
5124abf
A big step towards getting IDs straight, but… (#259)
Byron Nov 24, 2021
5180f11
Flesh out Sync requirement for Repository (#259)
Byron Nov 25, 2021
70dc445
Make handles tracked to allow automatic handling of pack-unloading (#…
Byron Nov 25, 2021
ef9b08c
Learn about store's resource consumption (#259)
Byron Nov 25, 2021
58b0bcf
Get a better understanding on how on-disk state can be represented in…
Byron Nov 25, 2021
6187b5c
sketch a more concise way of keeping indices and packs relative to di…
Byron Nov 25, 2021
407be6c
Simplify tracking of handles, using tokens to make that safer (#259)
Byron Nov 25, 2021
d8d5a60
refactor
Byron Nov 25, 2021
6f5eaf1
Allow on-disk files to be reused as will in case they re-appear (#259)
Byron Nov 25, 2021
a88981b
btree/hashmap free lookup of packs in store, keeping things more bund…
Byron Nov 25, 2021
87544c3
try loading actual packs with the current data structure… (#259)
Byron Nov 25, 2021
2977518
workaround borrow check, which in this case fits nicely (#259)
Byron Nov 25, 2021
a5a6a78
realize that pack lookup needs the marker or wrong packs might be ret…
Byron Nov 25, 2021
a14e4d0
A good way to avoid using a potentially wrong pack (#259)
Byron Nov 25, 2021
18 changes: 16 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -132,6 +132,7 @@ members = [
 "experiments/object-access",
 "experiments/diffing",
 "experiments/traversal",
+"experiments/odb-redesign",

 "cargo-smart-release",

523 changes: 523 additions & 0 deletions DEVELOPMENT.md

Large diffs are not rendered by default.

3 changes: 1 addition & 2 deletions cargo-smart-release/src/command/release/manifest.rs
@@ -11,12 +11,11 @@ use git_repository::lock::File;
 use semver::{Version, VersionReq};

 use super::{cargo, git, Context, Oid, Options};
-use crate::utils::version_req_unset_or_default;
 use crate::{
 changelog,
 changelog::{write::Linkables, Section},
 traverse::Dependency,
-utils::{names_and_versions, try_to_published_crate_and_new_version, will},
+utils::{names_and_versions, try_to_published_crate_and_new_version, version_req_unset_or_default, will},
 version, ChangeLog,
 };

1 change: 1 addition & 0 deletions experiments/object-access/Cargo.toml
@@ -12,3 +12,4 @@ anyhow = "1"
 git-repository = { version ="^0.12.0", path = "../../git-repository", features = ["unstable"] }
 git2 = "0.13"
 rayon = "1.5.0"
+parking_lot = { version = "0.11.0", default-features = false }
151 changes: 141 additions & 10 deletions experiments/object-access/src/main.rs
@@ -1,4 +1,4 @@
-use std::{path::Path, time::Instant};
+use std::{path::Path, sync::Arc, time::Instant};

 use anyhow::anyhow;
 use git_repository::{hash::ObjectId, odb, prelude::*, Repository};
@@ -21,6 +21,7 @@ fn main() -> anyhow::Result<()> {
 };

 let objs_per_sec = |elapsed: std::time::Duration| hashes.len() as f32 / elapsed.as_secs_f32();
+
 let start = Instant::now();
 do_gitoxide_in_parallel(&hashes, &repo, || odb::pack::cache::Never, AccessMode::ObjectExists)?;
 let elapsed = start.elapsed();
@@ -32,36 +33,65 @@ fn main() -> anyhow::Result<()> {
 );

 let start = Instant::now();
-let bytes = do_gitoxide_in_parallel(
+let bytes = do_gitoxide_in_parallel_through_arc(
 &hashes,
-&repo,
-|| odb::pack::cache::lru::MemoryCappedHashmap::new(GITOXIDE_CACHED_OBJECT_DATA_PER_THREAD_IN_BYTES),
+&repo.odb.dbs[0].loose.path,
+odb::pack::cache::Never::default,
 AccessMode::ObjectData,
 )?;
 let elapsed = start.elapsed();
 println!(
-"parallel gitoxide (cache = {:.0}MB): confirmed {} bytes in {:?} ({:0.0} objects/s)",
-GITOXIDE_CACHED_OBJECT_DATA_PER_THREAD_IN_BYTES as f32 / (1024 * 1024) as f32,
+"parallel gitoxide (uncached, Arc): confirmed {} bytes in {:?} ({:0.0} objects/s)",
 bytes,
 elapsed,
 objs_per_sec(elapsed)
 );

 let start = Instant::now();
-let bytes = do_gitoxide_in_parallel(
+do_gitoxide_in_parallel_through_arc(
 &hashes,
-&repo,
-odb::pack::cache::lru::StaticLinkedList::<GITOXIDE_STATIC_CACHE_SIZE>::default,
+&repo.odb.dbs[0].loose.path,
+odb::pack::cache::Never::default,
+AccessMode::ObjectExists,
+)?;
+let elapsed = start.elapsed();
+println!(
+"parallel gitoxide (Arc): confirmed {} objects exists in {:?} ({:0.0} objects/s)",
+hashes.len(),
+elapsed,
+objs_per_sec(elapsed)
+);
+
+let start = Instant::now();
+let bytes = do_gitoxide_in_parallel_through_arc_rw_lock(
+&hashes,
+&repo.odb.dbs[0].loose.path,
+odb::pack::cache::Never::default,
 AccessMode::ObjectData,
 )?;
 let elapsed = start.elapsed();
 println!(
-"parallel gitoxide (small static cache): confirmed {} bytes in {:?} ({:0.0} objects/s)",
+"parallel gitoxide (uncached, Arc, Lock): confirmed {} bytes in {:?} ({:0.0} objects/s)",
 bytes,
 elapsed,
 objs_per_sec(elapsed)
 );

+let start = Instant::now();
+do_gitoxide_in_parallel_through_arc_rw_lock(
+&hashes,
+&repo.odb.dbs[0].loose.path,
+odb::pack::cache::Never::default,
+AccessMode::ObjectExists,
+)?;
+let elapsed = start.elapsed();
+println!(
+"parallel gitoxide (Arc, Lock): confirmed {} objects exists in {:?} ({:0.0} objects/s)",
+hashes.len(),
+elapsed,
+objs_per_sec(elapsed)
+);
+
 let start = Instant::now();
 let bytes = do_gitoxide_in_parallel(&hashes, &repo, odb::pack::cache::Never::default, AccessMode::ObjectData)?;
 let elapsed = start.elapsed();
@@ -72,6 +102,37 @@ fn main() -> anyhow::Result<()> {
 objs_per_sec(elapsed)
 );

+let start = Instant::now();
+let bytes = do_gitoxide_in_parallel(
+&hashes,
+&repo,
+|| odb::pack::cache::lru::MemoryCappedHashmap::new(GITOXIDE_CACHED_OBJECT_DATA_PER_THREAD_IN_BYTES),
+AccessMode::ObjectData,
+)?;
+let elapsed = start.elapsed();
+println!(
+"parallel gitoxide (cache = {:.0}MB): confirmed {} bytes in {:?} ({:0.0} objects/s)",
+GITOXIDE_CACHED_OBJECT_DATA_PER_THREAD_IN_BYTES as f32 / (1024 * 1024) as f32,
+bytes,
+elapsed,
+objs_per_sec(elapsed)
+);
+
+let start = Instant::now();
+let bytes = do_gitoxide_in_parallel(
+&hashes,
+&repo,
+odb::pack::cache::lru::StaticLinkedList::<GITOXIDE_STATIC_CACHE_SIZE>::default,
+AccessMode::ObjectData,
+)?;
+let elapsed = start.elapsed();
+println!(
+"parallel gitoxide (small static cache): confirmed {} bytes in {:?} ({:0.0} objects/s)",
+bytes,
+elapsed,
+objs_per_sec(elapsed)
+);
+
 let start = Instant::now();
 let bytes = do_parallel_git2(hashes.as_slice(), repo.git_dir())?;
 let elapsed = start.elapsed();
@@ -213,3 +274,73 @@ where

 Ok(bytes.load(std::sync::atomic::Ordering::Acquire))
 }
+
+fn do_gitoxide_in_parallel_through_arc<C>(
+hashes: &[ObjectId],
+repo: &Path,
+new_cache: impl Fn() -> C + Send + Clone,
+mode: AccessMode,
+) -> anyhow::Result<u64>
+where
+C: odb::pack::cache::DecodeEntry,
+{
+let bytes = std::sync::atomic::AtomicU64::default();
+let odb = Arc::new(git_repository::odb::linked::Store::at(repo)?);
+
+git_repository::parallel::in_parallel(
+hashes.chunks(1000),
+None,
+move |_| (Vec::new(), new_cache(), odb.clone()),
+|hashes, (buf, cache, odb)| {
+for hash in hashes {
+match mode {
+AccessMode::ObjectData => {
+let obj = odb.find(hash, buf, cache)?;
+bytes.fetch_add(obj.data.len() as u64, std::sync::atomic::Ordering::Relaxed);
+}
+AccessMode::ObjectExists => {
+assert!(odb.contains(hash), "each traversed object exists");
+}
+}
+}
+Ok(())
+},
+git_repository::parallel::reduce::IdentityWithResult::<(), anyhow::Error>::default(),
+)?;
+Ok(bytes.load(std::sync::atomic::Ordering::Acquire))
+}
+
+fn do_gitoxide_in_parallel_through_arc_rw_lock<C>(
+hashes: &[ObjectId],
+repo: &Path,
+new_cache: impl Fn() -> C + Send + Clone,
+mode: AccessMode,
+) -> anyhow::Result<u64>
+where
+C: odb::pack::cache::DecodeEntry,
+{
+let bytes = std::sync::atomic::AtomicU64::default();
+let odb = Arc::new(parking_lot::RwLock::new(git_repository::odb::linked::Store::at(repo)?));
+
+git_repository::parallel::in_parallel(
+hashes.chunks(1000),
+None,
+move |_| (Vec::new(), new_cache(), odb.clone()),
+|hashes, (buf, cache, odb)| {
+for hash in hashes {
+match mode {
+AccessMode::ObjectData => {
+let obj = odb.read().find(hash, buf, cache)?;
+bytes.fetch_add(obj.data.len() as u64, std::sync::atomic::Ordering::Relaxed);
+}
+AccessMode::ObjectExists => {
+assert!(odb.read().contains(hash), "each traversed object exists");
+}
+}
+}
+Ok(())
+},
+git_repository::parallel::reduce::IdentityWithResult::<(), anyhow::Error>::default(),
+)?;
+Ok(bytes.load(std::sync::atomic::Ordering::Acquire))
+}
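
Reviewer note: the two helpers added to `experiments/object-access/src/main.rs` differ only in how the store is shared between worker threads, a plain `Arc` versus an `Arc` around a read-write lock that is read-locked for every lookup. The following is a minimal, standard-library-only sketch of that difference, not the experiment itself. `Store`, `count_through_arc` and `count_through_arc_rw_lock` are hypothetical stand-ins, and `std::sync::RwLock` is used in place of `parking_lot::RwLock` (an assumption made to keep the sketch dependency-free; the two have different APIs, for instance the std lock can be poisoned).

```rust
use std::collections::HashSet;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for an object database whose lookups only need `&self`.
struct Store {
    ids: HashSet<u32>,
}

impl Store {
    fn contains(&self, id: u32) -> bool {
        self.ids.contains(&id)
    }
}

/// Shares the store through a plain `Arc`: every worker clones the handle and reads without locking.
/// `chunk_size` must be greater than zero.
fn count_through_arc(store: Arc<Store>, ids: &[u32], chunk_size: usize) -> usize {
    thread::scope(|scope| {
        let workers: Vec<_> = ids
            .chunks(chunk_size)
            .map(|chunk| {
                let store = Arc::clone(&store);
                scope.spawn(move || chunk.iter().filter(|id| store.contains(**id)).count())
            })
            .collect();
        workers
            .into_iter()
            .map(|worker| worker.join().expect("worker panicked"))
            .sum::<usize>()
    })
}

/// Shares the store through `Arc<RwLock<_>>`: every lookup takes a short-lived read guard,
/// mirroring the `odb.read().contains(hash)` call in the diff.
fn count_through_arc_rw_lock(store: Arc<RwLock<Store>>, ids: &[u32], chunk_size: usize) -> usize {
    thread::scope(|scope| {
        let workers: Vec<_> = ids
            .chunks(chunk_size)
            .map(|chunk| {
                let store = Arc::clone(&store);
                scope.spawn(move || {
                    chunk
                        .iter()
                        .filter(|id| store.read().expect("lock poisoned").contains(**id))
                        .count()
                })
            })
            .collect();
        workers
            .into_iter()
            .map(|worker| worker.join().expect("worker panicked"))
            .sum::<usize>()
    })
}

fn main() {
    let ids: Vec<u32> = (0..10_000).collect();
    let plain = Arc::new(Store { ids: ids.iter().copied().collect() });
    let locked = Arc::new(RwLock::new(Store { ids: ids.iter().copied().collect() }));

    let found_plain = count_through_arc(plain, &ids, 1000);
    let found_locked = count_through_arc_rw_lock(locked, &ids, 1000);
    println!("Arc: {found_plain} found, Arc + RwLock: {found_locked} found");
}
```

The read guard is a temporary that is released at the end of each lookup expression, so a writer (for example one that refreshes the store) could get in between lookups. That per-lookup locking is the overhead the `(Arc, Lock)` measurements in the diff appear intended to quantify.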
19 changes: 19 additions & 0 deletions experiments/odb-redesign/Cargo.toml
@@ -0,0 +1,19 @@
+[package]
+name = "odb-redesign"
+version = "0.1.0"
+edition = "2021"
+
+# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
+
+[features]
+default = ["thread-safe"]
+thread-safe = []
+
+[dependencies]
+git-pack = { path = "../../git-pack", version = "*" }
+git-odb = { path = "../../git-odb", version = "*" }
+git-hash = { path = "../../git-hash", version ="^0.8.0" }
+git-ref = { path = "../../git-ref", version ="^0.9.1" }
+parking_lot = { version = "0.11.0", default-features = false }
+thiserror = "1.0.30"
+anyhow = "1.0.47"
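
Reviewer note: the manifest above declares a dependency-free `thread-safe` feature that is on by default, but the experiment's source is not rendered on this page. As a sketch of how such a flag could be consumed (an assumption, not the code in this PR), a crate can switch a pair of type aliases between the `Arc`/`RwLock` and `Rc`/`RefCell` families. The `threading` module and the names `OwnShared`, `Mutable`, `read`, `write` and `shared_counter` are hypothetical. Compiled outside of Cargo the feature is unset, so the single-threaded branch is selected.

```rust
// Hypothetical sketch: selecting shared-ownership primitives with a `thread-safe` Cargo feature.

#[cfg(feature = "thread-safe")]
mod threading {
    use std::sync::{Arc, RwLock, RwLockReadGuard, RwLockWriteGuard};

    pub type OwnShared<T> = Arc<T>;
    pub type Mutable<T> = RwLock<T>;

    pub fn read<T>(value: &Mutable<T>) -> RwLockReadGuard<'_, T> {
        value.read().expect("lock poisoned")
    }

    pub fn write<T>(value: &Mutable<T>) -> RwLockWriteGuard<'_, T> {
        value.write().expect("lock poisoned")
    }
}

#[cfg(not(feature = "thread-safe"))]
mod threading {
    use std::cell::{Ref, RefCell, RefMut};
    use std::rc::Rc;

    pub type OwnShared<T> = Rc<T>;
    pub type Mutable<T> = RefCell<T>;

    pub fn read<T>(value: &Mutable<T>) -> Ref<'_, T> {
        value.borrow()
    }

    pub fn write<T>(value: &Mutable<T>) -> RefMut<'_, T> {
        value.borrow_mut()
    }
}

use threading::{Mutable, OwnShared};

/// Code written against the aliases compiles unchanged in both configurations.
fn shared_counter(start: u64) -> OwnShared<Mutable<u64>> {
    OwnShared::new(Mutable::new(start))
}

fn main() {
    let counter = shared_counter(1);
    let other = OwnShared::clone(&counter);
    *threading::write(&other) += 41;
    println!("counter = {}", *threading::read(&counter));
}
```

With this layout, single-threaded builds avoid atomic reference counting and locking entirely, while the default feature set remains usable across threads.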