Introduce storage module #114

Merged
merged 8 commits into trustification:main from feature/storage_1
Mar 28, 2024

jcrossley3 (Contributor) commented Mar 25, 2024

This includes the commits from #113 and resolves the conflicts incurred from #116

I modified the StorageBackend a tad, breaking out the "faux sync" methods into a helper class.

ctron and others added 6 commits March 27, 2024 17:31
I don't love having `retrieve_sync` and `retrieve_buf` as part of the
primary Storage trait, since it seems they only exist because serde
can't deserialize async streams. If there's no more elegant way to
overcome that, I'd prefer we break them out into some sort of "helper
class".

Ideally, the storage trait would have only retrieve and store.

Signed-off-by: Jim Crossley <jim@crossleys.org>
Signed-off-by: Jim Crossley <jim@crossleys.org>
jcrossley3 force-pushed the feature/storage_1 branch 2 times, most recently from 0ac80b3 to 84e867f, March 28, 2024 01:25
Remove advisory format for the time being

Signed-off-by: Jim Crossley <jim@crossleys.org>
Includes the introduction of a Format enum with tests for both OSV and
CSAF formats.

Signed-off-by: Jim Crossley <jim@crossleys.org>
jcrossley3 changed the title from "Manifest review of #113" to "Introduce storage module" Mar 28, 2024
jcrossley3 marked this pull request as ready for review March 28, 2024 15:52
bobmcwhirter (Contributor) left a comment

Everything I've commented on is just nit-picky.

I defer to your discretion as to what you want to address in this PR or in follow-ups.

I also will assume the burden of rebasing my subsequent PR so you can get this one landed.

self.ingestor
    .ingest(
        &location,
        Format::CSAF,

Contributor

Typing while I read, this may be answered below but...

Can we link the Format::CSAF/OSV etc to the actual path param in the REST API, instead of being stringly-typed as I left it?

Contributor Author

I assume this question will be answered by #122 ?

Contributor

Ahh. Kk!


  /// run the importer loop
- pub async fn importer(db: Database) -> anyhow::Result<()> {
-     Server { db }.run().await
+ pub async fn importer(db: Database, storage: DispatchBackend) -> anyhow::Result<()> {

Contributor

Why is a variable named storage of a type named DispatchBackend?

Contributor Author

I dunno, honestly. There is a comment left by @ctron in dispatch.rs but I didn't fully grok it.

payload: web::Payload,
web::Query(UploadAdvisoryQuery { location, format }): web::Query<UploadAdvisoryQuery>,

Contributor

Moving to query params instead of a path param? I'm fine with that. We should probably start a Tenets document or discussion so we stay consistent and record how we decide which option to choose.

Contributor Author

Cool. I don't have a strong opinion either way, I was just reusing the Query struct.

}
let fmt = format
    .map(|f| Format::from_str(&f))
    .unwrap_or(Ok(Format::CSAF))?;

Contributor

Default to CSAF format? If someone POSTs in a RANDOFORMAT, which is not actually CSAF, then we'll end up with a bunch of "this is invalid CSAF" errors instead of "We don't support RANDOFORMAT" yeah?

Contributor Author

My logic may be flawed. How should it work?

Contributor

Maybe .ok_or(Error::InvalidFormat)?, which turns a None into an error and a Some into the underlying value without an unwrap. Sorta the err-variant of unwrap_or(...).

Contributor Author

I think my logic is correct: I'm mapping and unwrapping an Option containing a Result. So the unwrap_or is returning an Err if the format param is invalid. The Ok(Format::CSAF) is only returned if format is None.

See #126
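
The Option-containing-a-Result reasoning above can be checked with a small standalone sketch. Only the `Format::CSAF` and `Format::OSV` variants and the `map`/`unwrap_or` expression come from this thread; the `FromStr` impl, the `UnsupportedFormat` error, and the `resolve_format` helper are hypothetical stand-ins, and the accepted spellings ("csaf", "osv") are an assumption.

```rust
use std::str::FromStr;

/// Stand-in for the PR's `Format` enum (only the variants named in the thread).
#[derive(Debug, PartialEq)]
pub enum Format {
    CSAF,
    OSV,
}

/// Hypothetical error type; the real error is not shown in this thread.
#[derive(Debug, PartialEq)]
pub struct UnsupportedFormat(pub String);

impl FromStr for Format {
    type Err = UnsupportedFormat;

    // ASSUMPTION: lowercase names are the accepted spellings.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "csaf" => Ok(Format::CSAF),
            "osv" => Ok(Format::OSV),
            _ => Err(UnsupportedFormat(s.to_string())),
        }
    }
}

/// Hypothetical helper wrapping the expression under review.
/// `None` falls back to CSAF; `Some(unknown)` surfaces the parse error.
pub fn resolve_format(format: Option<String>) -> Result<Format, UnsupportedFormat> {
    let fmt = format
        .map(|f| Format::from_str(&f)) // Option<Result<Format, _>>
        .unwrap_or(Ok(Format::CSAF))?; // default applies only when the param is absent
    Ok(fmt)
}
```

With these definitions, `resolve_format(None)` yields `Ok(Format::CSAF)`, while `resolve_format(Some("RANDOFORMAT".to_string()))` yields the parse error rather than a CSAF fallback. An equivalent spelling that some readers may find easier to follow is `format.map(|f| Format::from_str(&f)).transpose()?.unwrap_or(Format::CSAF)`.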

}

#[test_log::test(actix_web::test)]
async fn upload_osv() -> Result<(), anyhow::Error> {

Contributor

nice!

.retrieve(sha256.clone())
.await
.map_err(Error::Storage)?
.ok_or_else(|| Error::Storage(anyhow!("file went missing during upload")))?;

Contributor

For bonus points, do we want to re-hash to also ensure file contents didn't change during the shuffling?

Contributor Author

I think being able to successfully retrieve it ensures that, right?

Contributor

I dunno tbh. I think it means we managed to find some file at the hash-named location. Belt-and-suspenders would then verify, I guess, if the file retrieved by hash, when hashed, hashes to the hash of the hash-based file.

Yo dawg, I heard you liked hash, so I put some hash in your hash so you can hash while you hash.

Contributor Author

I see what you mean, verify we got what we put.
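
A minimal sketch of the "verify we got what we put" idea discussed above: re-hash the retrieved bytes and compare the result with the key they were stored under. Everything here is hypothetical, including the names `digest_hex`, `verify_retrieved`, and `DigestMismatch`. Note in particular that the standard library's `DefaultHasher` is swapped in for SHA-256 purely so the sketch needs no external crates; it is not a cryptographic hash and is not what the PR uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in digest function.
/// ASSUMPTION: real code would compute SHA-256; `DefaultHasher` is used here
/// only to keep the sketch dependency-free. It is NOT cryptographically secure.
pub fn digest_hex(content: &[u8]) -> String {
    let mut hasher = DefaultHasher::new();
    hasher.write(content);
    format!("{:016x}", hasher.finish())
}

/// Hypothetical error describing a content/key mismatch.
#[derive(Debug, PartialEq)]
pub struct DigestMismatch {
    pub expected: String,
    pub actual: String,
}

/// Re-hash the retrieved content and compare it with the key it was stored under.
pub fn verify_retrieved(expected: &str, content: &[u8]) -> Result<(), DigestMismatch> {
    let actual = digest_hex(content);
    if actual == expected {
        Ok(())
    } else {
        Err(DigestMismatch {
            expected: expected.to_string(),
            actual,
        })
    }
}
```

The check costs one extra pass over the content, which is the trade-off behind the "bonus points" framing: a successful retrieve shows that some file exists at the hash-named location, while the re-hash shows that its bytes still match that name.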

    .ok_or_else(|| Error::Storage(anyhow!("file went missing during upload")))?;

let result = match fmt {
    Format::CSAF => {

Contributor

Not for this PR, but when we add a 3rd format, probably smart to shuffle into a fn on Format itself to handle the delegation to the appropriate loader?

Contributor Author

Agree.
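
One possible shape for that follow-up, sketched under assumptions: the `match` moves into a method on `Format`, so call sites no longer need to know which loader handles which format. The `load` method, the `load_csaf`/`load_osv` functions, and their string-based signatures are all hypothetical placeholders; the real loaders are not shown in this thread.

```rust
/// Stand-in for the PR's `Format` enum.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Format {
    CSAF,
    OSV,
}

/// Hypothetical CSAF loader; a real one would parse and ingest the document.
fn load_csaf(raw: &str) -> Result<String, String> {
    Ok(format!("csaf document, {} bytes", raw.len()))
}

/// Hypothetical OSV loader, same placeholder behavior.
fn load_osv(raw: &str) -> Result<String, String> {
    Ok(format!("osv document, {} bytes", raw.len()))
}

impl Format {
    /// Delegate to the loader for this format. Adding a third format means
    /// adding one variant and one match arm here; the compiler flags the
    /// missing arm because the match is exhaustive.
    pub fn load(self, raw: &str) -> Result<String, String> {
        match self {
            Format::CSAF => load_csaf(raw),
            Format::OSV => load_osv(raw),
        }
    }
}
```

A call site would then reduce to something like `fmt.load(&body)?` instead of repeating the `match` wherever a document is ingested.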

Filesystem(FileSystemBackend),
}

impl StorageBackend for DispatchBackend {

Contributor

Ahhh, DispatchBackend is-a form of storage. Ignore previous.

Contributor Author

I reply to these in order. 😄

const NUM_LEVELS: usize = 2;

impl FileSystemBackend {
    pub async fn new(base: impl Into<PathBuf>) -> anyhow::Result<Self> {

Contributor

See my pending PR for trustd. We should maybe default to .trustify/...whatever for the Simple Happy PM Use-Case?

Contributor

(not this PR though)

Contributor Author

Should this go in #121 ?


let hash = hex::encode(digest);
let target = level_dir(&self.content, &hash, NUM_LEVELS);
create_dir_all(&target).await.map_err(StoreError::Backend)?;

Contributor

Should we sprinkle some from derives or impl From<..> to avoid having to map_err(...)?

Contributor Author

Probably. I expect much of the storage module to evolve.
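
For illustration, a hand-written `From` impl is enough to let `?` do the conversion that `map_err(StoreError::Backend)` does today. This is a sketch under assumptions: the thread only shows that a `StoreError::Backend` variant exists, so wrapping `std::io::Error` is a guess, and the synchronous `std::fs::create_dir_all` stands in for the async call in the diff. The `ensure_dir` helper is hypothetical.

```rust
use std::io;
use std::path::Path;

/// Stand-in for the PR's `StoreError`.
/// ASSUMPTION: the `Backend` variant wraps `std::io::Error`.
#[derive(Debug)]
pub enum StoreError {
    Backend(io::Error),
}

/// With this impl in place, `?` converts `io::Error` into `StoreError`
/// automatically, so no explicit `map_err` is needed at call sites.
impl From<io::Error> for StoreError {
    fn from(err: io::Error) -> Self {
        StoreError::Backend(err)
    }
}

/// Hypothetical helper: create the target directory and any missing parents.
/// The sync `std::fs` call is used only to keep the sketch dependency-free.
pub fn ensure_dir(target: &Path) -> Result<(), StoreError> {
    std::fs::create_dir_all(target)?; // io::Error -> StoreError via From
    Ok(())
}
```

One design consideration: a blanket `From<io::Error>` maps every I/O failure to the same variant, which is convenient but loses the ability to tag errors by the operation that produced them.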

bobmcwhirter (Contributor) left a comment

I meant to tick the [approve] box.

jcrossley3 added this pull request to the merge queue Mar 28, 2024
Merged via the queue into trustification:main with commit 70c8baa Mar 28, 2024
3 checks passed
jcrossley3 deleted the feature/storage_1 branch March 28, 2024 19:33

3 participants