-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce storage module #114
Conversation
dafaa20
to
0356a79
Compare
I don't love having `retrieve_sync` and `retrieve_buf` as part of the primary Storage trait, since it seems they only exist because serde can't deserialize async streams. If there's no more elegant way to overcome that, I'd prefer we break them out into some sort of "helper class". Ideally, the storage trait would have only retrieve and store. Signed-off-by: Jim Crossley <jim@crossleys.org>
Signed-off-by: Jim Crossley <jim@crossleys.org>
0ac80b3
to
84e867f
Compare
Remove advisory format for the time being Signed-off-by: Jim Crossley <jim@crossleys.org>
84e867f
to
9dfa13e
Compare
Includes the introduction of a Format enum with tests for both OSV and CSAF formats. Signed-off-by: Jim Crossley <jim@crossleys.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything I've commented on is just nit-picky.
I defer to your discretion as to what you want to address in this PR or in follow-ups.
I also will assume the burden of rebasing my subsequent PR so you can get this one landed.
self.ingestor | ||
.ingest( | ||
&location, | ||
Format::CSAF, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typing while I read, this may be answered below but...
Can we link the Format::CSAF/OSV etc to the actual path param in the REST API, instead of being stringly-typed as I left it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this question will be answered by #122 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh. Kk!
|
||
/// run the importer loop | ||
pub async fn importer(db: Database) -> anyhow::Result<()> { | ||
Server { db }.run().await | ||
pub async fn importer(db: Database, storage: DispatchBackend) -> anyhow::Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is a variable named storage
of a type named DispatchBackend
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dunno, honestly. There is a comment left by @ctron in dispatch.rs
but I didn't fully grok it.
payload: web::Payload, | ||
web::Query(UploadAdvisoryQuery { location, format }): web::Query<UploadAdvisoryQuery>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving to query params instead of path param? I'm fine with that. We should probably start a Tenets
document or discussion to ensure we keep consistent and how we decide which option to choose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I don't have a strong opinion either way, I was just reusing the Query struct.
} | ||
let fmt = format | ||
.map(|f| Format::from_str(&f)) | ||
.unwrap_or(Ok(Format::CSAF))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default to CSAF format? If someone POSTs in a RANDOFORMAT, which is not actually CSAF, then we'll end up with a bunch of "this is invalid CSAF" errors instead of "We don't support RANDOFORMAT" yeah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My logic may be flawed. How should it work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe .ok_or(Error::InvalidFormat)?
which will turn a None into a throw error, or a Some into the underlying value without unwrap. Sorta the err-variant of unwrap_or(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my logic is correct: I'm mapping and unwrapping an Option
containing a Result
. So the unwrap_or
is returning an Err
if the format param is invalid. The Ok(Format::CSAF)
is only returned if format
is None
.
See #126
} | ||
|
||
#[test_log::test(actix_web::test)] | ||
async fn upload_osv() -> Result<(), anyhow::Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
.retrieve(sha256.clone()) | ||
.await | ||
.map_err(Error::Storage)? | ||
.ok_or_else(|| Error::Storage(anyhow!("file went missing during upload")))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For bonus points, do we want to re-hash to also ensure file contents didn't change during the shuffling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think being able to successfully retrieve it ensures that, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dunno tbh. I think it means we managed to find some file at the hash-named location. Belt-and-suspenders would then verify, I guess, if the file retrieved by hash, when hashed, hashes to the hash of the hash-based file.
Yo dawg, I heard you liked hash, so I put some hash in your hash so you can hash while you hash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean, verify we got what we put.
.ok_or_else(|| Error::Storage(anyhow!("file went missing during upload")))?; | ||
|
||
let result = match fmt { | ||
Format::CSAF => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but when we add a 3rd format, probably smart to shuffle into a fn
on Format
itself to handle the delegation to the appropriate loader?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
Filesystem(FileSystemBackend), | ||
} | ||
|
||
impl StorageBackend for DispatchBackend { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh, DispatchBackend
is-a form of storage. Ignore previous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reply to these in order. 😄
const NUM_LEVELS: usize = 2; | ||
|
||
impl FileSystemBackend { | ||
pub async fn new(base: impl Into<PathBuf>) -> anyhow::Result<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my pending PR for trustd
. We should maybe default to .trustify/...whatever
for the Simple Happy PM Use-Case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(not this PR though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this go in #121 ?
|
||
let hash = hex::encode(digest); | ||
let target = level_dir(&self.content, &hash, NUM_LEVELS); | ||
create_dir_all(&target).await.map_err(StoreError::Backend)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we sprinkle some from
derives or impl From<..>
to avoid having to map_err(...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably. I expect much of the storage module to evolve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant to tick the [approve] box.
This includes the commits from #113 and resolves the conflicts incurred from #116
I modified the
StorageBackend
a tad, breaking out the "faux sync" methods into a helper class.