Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: make PM mode work as a sane person might expect #174

Merged
merged 14 commits into from
Apr 25, 2024

Conversation

jcrossley3
Copy link
Contributor

@jcrossley3 jcrossley3 commented Apr 16, 2024

Fixes #171

And resolves #185 in favor of subcommands:

  • trustd api
  • trustd db create|migrate|refresh
  • trustd

The latter no-arg command is "PM mode", which fires up an embedded PostgreSQL instance (installed beneath .trustify/) and then runs trustd db migrate and trustd api --auth-disabled.

@jcrossley3
Copy link
Contributor Author

FYI, i put those unwrap's in there (the ones clippy is complaining about) to force a panic because the app was just silently halting. Need to figure out the right place to init the env_logger, because putting it in trustd's main causes a WARN later when the infra configures its tracing.

What's the right way to do things around here?

@bobmcwhirter
Copy link
Contributor

I think at that point either eprintln and/or expect()

@ctron
Copy link
Contributor

ctron commented Apr 17, 2024

What's the right way to do things around here?

#135

Like here: https://github.com/trustification/trustification/blob/04a3554fa95b500ce2208c7d8ed97a96fa47788e/spog/api/src/lib.rs#L97-L112

@ctron
Copy link
Contributor

ctron commented Apr 17, 2024

We always CREATE_DATABASE and ignore "already exists" errors.

Assuming we want to run that in production at some point, we would have database creation as well as running migrations outside of the main process, in order to limit the permissions the actual database user requires.

So we at least would need a mode which just accepts an existing database, assumes that the structure is present, and does not try to modify anything beyond actual data.

@jcrossley3
Copy link
Contributor Author

Good points. I guess we need to ensure the "create db and ignore errors" behavior is only for "managed" db's. Which kind of conflicts with #173 I think.

I don't have a clear solution in mind yet, so open to ideas.

@bobmcwhirter
Copy link
Contributor

I agree that in Production we wouldn't want trustd to be creating DBs.

I'm not sure I agree that running migrations should be outside of trustd's control though.

Thinking back to my Ruby On Rails days, migrations were always in-process at boot time and that seems to have worked?

trustd/src/main.rs Outdated Show resolved Hide resolved
@bobmcwhirter
Copy link
Contributor

wrt "create db and ignore errors" maybe needs to float up out of Db and into the trustd wrapper/setup bits.

We either take an external config, or manage/bootstrap/ignore-errors up a db, and then ctor the Db with whichever database we ultimately ended up with booting up in PM-mode-or-not. From the Db perspective, they're all external. Separation of responsibility that Db just ... is a database.

Externalize the optional management of the aforementioned database.

@ctron
Copy link
Contributor

ctron commented Apr 17, 2024

I'm not sure I agree that running migrations should be outside of trustd's control though.

Thinking back to my Ruby On Rails days, migrations were always in-process at boot time and that seems to have worked?

The downside of this is, that the user connecting to the database would need some "admin" permission set. Which the application itself would not need.

Following the idea of "limiting access" … it feels better to not give a process, which receives connections from the internet, admin permissions to the DB.

So the process we've in place for GUAC is: init container -> runs migrations, actual container -> runs code … the latter one has a limited set of permissions. And that works quite well.

@ctron
Copy link
Contributor

ctron commented Apr 17, 2024

wrt panic, expect … just a few lines above this code we use ?, which I do prefer. So why break that pattern?

@bobmcwhirter
Copy link
Contributor

Ah, I'd not realized that CREATE TABLE and other migratory actions required higher permissions than simple user.

Agreed on the bifurcated migration-vs-runtime.

@bobmcwhirter
Copy link
Contributor

@jcrossley3 so I think we can still make db::DbStrategy go away, and pass in some bool for attempt_migrations for the PM-mode use-case, else we skip that branch in the init-container/production case.

@jcrossley3
Copy link
Contributor Author

@ctron With my last commit to this PR, I think I have the Infrastructure in trustd now, but I didn't have a really good example to guide me -- the trust cli in the old repo lacks Infrastructure, too. And I'm still seeing unexpected warnings when the server initializes its own Infrastructure -- I feel like that should only be done once, but I didn't know how to make the InitData::new stuff work without a context.

Here are the steps to reproduce to see the output I'm getting. I'm seeing correct logging now -- no println's -- and my exit code is correct, but I don't think I'm doing it correctly, because I have 2 "Default to INFO logging" lines: one for trustd and one for the server it starts.

[jim@fedora trustify]$ rm -rf ./.trustify/

[jim@fedora trustify]$ cargo run --bin trustd -- --auth-disabled
    Finished dev [unoptimized + debuginfo] target(s) in 0.17s
     Running `target/debug/trustd --auth-disabled`
Default to INFO logging; use RUST_LOG environment variable to change
[2024-04-17T21:06:10.358Z INFO  trustify_infrastructure::infra] Infrastructure endpoint is disabled
[2024-04-17T21:06:10.358Z INFO  trustd] setting up managed DB
[2024-04-17T21:06:15.830Z ERROR trustd] Error: deadline has elapsed

[jim@fedora trustify]$ cargo run --bin trustd -- --auth-disabled
    Finished dev [unoptimized + debuginfo] target(s) in 0.17s
     Running `target/debug/trustd --auth-disabled`
Default to INFO logging; use RUST_LOG environment variable to change
[2024-04-17T21:06:22.342Z INFO  trustify_infrastructure::infra] Infrastructure endpoint is disabled
[2024-04-17T21:06:22.342Z INFO  trustd] setting up managed DB
[2024-04-17T21:06:22.446Z INFO  trustd] postgresql installed under "/home/jim/src/trustification/trustify/.trustify/postgres"
[2024-04-17T21:06:22.446Z INFO  trustd] running on port 45441
Default to INFO logging; use RUST_LOG environment variable to change
Error initializing logging: SetLoggerError(())
[2024-04-17T21:06:22.447Z WARN  trustify_server] Authentication is disabled
[2024-04-17T21:06:22.447Z INFO  trustify_common::db] connect to postgres://trustify:trustify@localhost:45441/trustify
[2024-04-17T21:06:22.461Z ERROR trustd] error Connection Error: error returned from database: database "trustify" does not exist
    
    Caused by:
        0: error returned from database: database "trustify" does not exist
        1: error returned from database: database "trustify" does not exist
        2: database "trustify" does not exist

@jcrossley3 jcrossley3 marked this pull request as draft April 19, 2024 13:20
@jcrossley3 jcrossley3 marked this pull request as ready for review April 23, 2024 19:01
trustd/src/main.rs Outdated Show resolved Hide resolved
trustd/src/main.rs Outdated Show resolved Hide resolved
trustd/src/main.rs Outdated Show resolved Hide resolved
@jcrossley3 jcrossley3 changed the title fix: set the data_dir field to retain data across restarts fix: make PM mode work as a sane person might expect Apr 24, 2024
Fixes trustification#171

Signed-off-by: Jim Crossley <jim@crossleys.org>
Fixes trustification#135

Signed-off-by: Jim Crossley <jim@crossleys.org>
This reverts commit e6e3767.

Signed-off-by: Jim Crossley <jim@crossleys.org>
Signed-off-by: Jim Crossley <jim@crossleys.org>
This seems to work, but if the db doesn't exist, a timeout error will
occur the first time you run trustd. Simply run it again.

Signed-off-by: Jim Crossley <jim@crossleys.org>
Signed-off-by: Jim Crossley <jim@crossleys.org>
Signed-off-by: Jim Crossley <jim@crossleys.org>
The default behavior of creating the data dir on a tmpfs volume is
much faster than an LVM module, so we need a bit more time.

Signed-off-by: Jim Crossley <jim@crossleys.org>
Resolves trustification#135 according to the following:

- have the infrastructure in the actual (sub-)command
- the top level main would log terminal errors using eprintln
- other, more cli like commands, can use env_logger

In a nutshell:

- cli sub-commands use env_logger
- server style commands use infrastructure

We also ensure the global env-logger is only initialized once.

Signed-off-by: Jim Crossley <jim@crossleys.org>
Fixes trustification#173

The tests fail because the PostgreSQL instance goes out of scope
without the DbStrategy::Managed variant. But we'll fix that in the
next commit using a test-context.

Signed-off-by: Jim Crossley <jim@crossleys.org>
This keeps the postgresql instance alive for the duration of the
test. Unfortunately, 2 of the tests fail:

Error: Query Error: error returned from database: could not access file "$libdir/plpgsql": No such file or directory

failures:
    graph::sbom::tests::sbom_vulnerabilities
    graph::sbom::tests::transitive_dependency_of

Signed-off-by: Jim Crossley <jim@crossleys.org>
Signed-off-by: Jim Crossley <jim@crossleys.org>
Signed-off-by: Jim Crossley <jim@crossleys.org>
modules/fetch/src/endpoints/advisory.rs Show resolved Hide resolved
README.md Show resolved Hide resolved
common/infrastructure/src/tracing.rs Show resolved Hide resolved
};
let db = crate::db::Database::bootstrap(&config).await.unwrap();

TrustifyContext { db, postgresql }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, so all of this helped remove the DbStrategy runtime ish, yeah? Yay!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partly. DbStrategy's primary benefit was keeping the PostgreSQL instance in scope for the duration of a test. Which is what the TrustifyContext now does.

pub async fn run(self) -> anyhow::Result<ExitCode> {
init_tracing("db-run", Tracing::Disabled);
use Command::*;
match self.command {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the commands stack? Do I need to invoke trustd at least twice in a fresh install, to Create and then Migrate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No and no. See the last line of Database::bootstrap: Self::new(database).await?.migrate().await

I couldn't think of a use case where I'd want to bootstrap a DB and not migrate it. But if you think we need that, we can revisit.

trustd/src/db.rs Outdated Show resolved Hide resolved
async fn run(self) -> anyhow::Result<ExitCode> {
match self.command {
Some(Command::Api(run)) => run.run().await,
Some(Command::Db(run)) => run.run().await,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which command handles the importer loops? Should the importer loops be split out from whichever one of these commands hides it? Or is that a regression and they don't run at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like you've asked about importer loops a lot, so I'm guessing my answers aren't punching through.

Currently, the API server will create a "loop" in response to a request like this:

http POST localhost:8080/api/v1/importer/redhat-sbom sbom[source]=https://access.redhat.com/security/data/sbom/beta/ sbom[keys][]=https://access.redhat.com/security/data/97f5eac4.txt#77E79ABE93673533ED09EBE2DCE3823597F5EAC4 sbom[disabled]:=false sbom[onlyPatterns][]=quarkus sbom[period]=30s sbom[v3Signatures]:=true

And you can see it like so:

http GET localhost:8080/api/v1/importer

Assuming you have https://httpie.io/ installed, try running those commands after firing up PM mode. Will that functionality suit the use cases you have in mind for importer loops? If not, what do we need to change?

}
db.run().await?;

let api = Trustd::parse_from([
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love the type-unsafe stringy parsing. Can we just rust up the api configured the way we like it, instead of strings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I do love it, because it mirrors what the user would do. But I agree it'd be nice for the compiler to complain if we subsequently change the CLI.

Interestingly, my first pass was as you suggest, but I found myself wanting to implement Default for things that are already kind of implemented in the form of clap annotations that I didn't want to replicate. And I wanted to take advantage of that clap config, hence parse_from.

It's definitely a tradeoff, but for something like "PM mode", I was ok with it. Better than ok, actually, but if you insist...

@jcrossley3 jcrossley3 mentioned this pull request Apr 25, 2024
Signed-off-by: Jim Crossley <jim@crossleys.org>
@jcrossley3 jcrossley3 added this pull request to the merge queue Apr 25, 2024
Merged via the queue into trustification:main with commit 9e274df Apr 25, 2024
1 check passed
@jcrossley3 jcrossley3 deleted the issue-171 branch April 25, 2024 14:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Determine if PM-Mode retains the DB across executions
3 participants