Add a --check subcommand #157

Open
juntyr opened this issue Jun 16, 2024 · 4 comments

juntyr commented Jun 16, 2024

First of all, thank you for creating this useful tool!

I've been experimenting with wit-deps for a while. Using it as a tightly integrated dependency manager for my wit deps from within my Rust deps doesn't quite work yet. For now, I'm therefore switching to running wit-deps only in CI, to assert that my deps.lock file is accurate and my deps folder is up to date. It seems that checking the deps folder currently requires running wit-deps update, which fully recreates the folder. Unfortunately, that won't work for my use case.

Would it be possible to add a wit-deps check subcommand? It should work almost identically to wit-deps update, but instead of overwriting folders and files, it would only check that their contents are the same. Crucially, it should follow symlinks.
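
To make the requested behaviour concrete, here is a minimal std-only sketch of a read-only comparison between two directory trees. It is not part of wit-deps; `list_files` and `check_dirs` are hypothetical names. It relies on `fs::metadata`, which follows symlinks, and it assumes the trees contain no symlink cycles.

```rust
use std::{
    fs, io,
    path::{Path, PathBuf},
};

/// Collects the relative paths of all non-directory entries under `root`, sorted.
/// `fs::metadata` follows symlinks, so linked directories are traversed as well.
fn list_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if fs::metadata(&path)?.is_dir() {
                stack.push(path);
            } else {
                // Every path was built by joining onto `root`, so this cannot fail.
                files.push(path.strip_prefix(root).unwrap().to_path_buf());
            }
        }
    }
    files.sort();
    Ok(files)
}

/// Compares two trees without writing to either of them.
/// Returns `Ok(None)` if they match, or a description of the first difference.
fn check_dirs(expected: &Path, actual: &Path) -> io::Result<Option<String>> {
    let (want, have) = (list_files(expected)?, list_files(actual)?);
    if want != have {
        return Ok(Some(format!(
            "file lists differ: expected {want:?}, found {have:?}"
        )));
    }
    for rel in &want {
        if fs::read(expected.join(rel))? != fs::read(actual.join(rel))? {
            return Ok(Some(format!("{} differs", rel.display())));
        }
    }
    Ok(None)
}
```

A check subcommand could fetch the expected dependencies into a scratch directory and call something like `check_dirs(scratch, wit_deps_dir)`, failing CI whenever a difference is reported.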

Thanks for your help and feedback :)

@rvolosatovs
Member

Thank you for the report @juntyr!
Perhaps wit-deps lock --check could address your use case?
See example usage in CI e.g. here: https://github.com/WebAssembly/wasi-http/blob/a81c61fc584278f801393b22289840a439e51f50/.github/workflows/main.yml#L18

@juntyr
Author

juntyr commented Jun 21, 2024

Thanks for pointing me in the direction of wit-deps lock --check! It almost solves my use case, but it also seems to modify the wit/deps folder. Could the check variant ensure that no changes are made to the current files?

@rvolosatovs
Member

> Thanks for pointing me in the direction of wit-deps lock --check! It almost solves my use case, but it also seems to modify the wit/deps folder. Could the check variant ensure that no changes are made to the current files?

That actually sounds like the right behavior (not making any changes), but I don't have the capacity to implement it any time soon. I suppose the way it would work for now is to fail fast if any of the paths specified in deps.toml are missing from wit/deps, and otherwise to construct a tar (until we have #25), hash it, and compare the hash to what's in deps.lock.
It looks like the fail-fast check is the only missing piece, since I think right now it probably just fetches everything that is missing and then hashes.
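
The fail-fast step described above could look roughly like the sketch below. This is an illustration only, not wit-deps code: `first_missing_dep` is a hypothetical helper, the dependency names are passed in directly rather than parsed from deps.toml, and it assumes that each dependency named in deps.toml lives in a directory of the same name under wit/deps.

```rust
use std::path::{Path, PathBuf};

/// Returns the first expected dependency directory that is absent from
/// `deps_dir`, or `None` if all of them exist.
/// `Path::is_dir` follows symlinks, so a symlinked dependency counts as present.
fn first_missing_dep(deps_dir: &Path, dep_names: &[&str]) -> Option<PathBuf> {
    dep_names
        .iter()
        .map(|name| deps_dir.join(name))
        .find(|path| !path.is_dir())
}
```

A check mode would return an error as soon as this yields `Some(path)`, instead of fetching the missing dependency, and only proceed to hashing when it yields `None`.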

Would you be interested in contributing this change?

@juntyr
Author

juntyr commented Jun 26, 2024

Unfortunately, I don't have the time at the moment either, but I can share the code that I'm using (ignore my custom error types). It runs wit-deps in a temporary directory and then compares the result against the existing deps folder, ignoring deep dependencies since they will be flattened anyway:

use std::{
    fmt,
    fs::{self, File},
    hash::{BuildHasher, DefaultHasher, Hasher, RandomState},
    io::{self, BufReader},
    path::Path,
};

use core_error::LocationError;
use tempdir::TempDir;
use walkdir::{DirEntry, WalkDir};

pub type Result<T, E = LocationError<AnyError>> = std::result::Result<T, E>;

#[allow(clippy::too_many_lines)]
#[allow(clippy::missing_errors_doc)]
pub fn check_is_locked(wit: impl AsRef<Path>) -> Result<()> {
    let wit = wit.as_ref();

    let deps_file = fs::read_to_string(wit.join("deps.toml")).map_err(AnyError::new)?;
    let deps_lock = fs::read_to_string(wit.join("deps.lock")).map_err(AnyError::new)?;

    let deps = TempDir::new("deps").map_err(AnyError::new)?;

    let lock = wit_deps::lock(
        Some(&wit),
        deps_file,
        Some(deps_lock),
        deps.path().join("deps"),
    );
    let lock = tokio::runtime::Builder::new_current_thread()
        .enable_io()
        .enable_time()
        .build()
        .map_err(AnyError::new)?
        .block_on(lock)
        .map_err(AnyError::from)?;

    if lock.is_some() {
        return Err(AnyError::msg("lock file has changed").into());
    }

    let old_wit = wit.join("deps");
    let new_wit = deps.path().join("deps");

    let mut old_deps = WalkDir::new(&old_wit)
        .min_depth(1)
        .follow_links(true)
        .sort_by_file_name()
        .into_iter();
    let mut new_deps = WalkDir::new(&new_wit)
        .min_depth(1)
        .follow_links(true)
        .sort_by_file_name()
        .into_iter();

    let (mut old_dep, mut new_dep) = (
        old_deps.next().transpose().map_err(AnyError::new)?,
        new_deps.next().transpose().map_err(AnyError::new)?,
    );

    loop {
        // skip indirect dependency deps files, lock files, and directories,
        //  since indirect dependencies are flattened into the main dependencies
        let skip = |dep: &Option<DirEntry>| match dep {
            Some(dep) if dep.path().ends_with("deps") && dep.file_type().is_dir() => {
                Some(Skip::Directory)
            },
            Some(dep)
                if (dep.path().ends_with("deps.toml") && dep.file_type().is_file())
                    || (dep.path().ends_with("deps.lock") && dep.file_type().is_file()) =>
            {
                Some(Skip::File)
            },
            _ => None,
        };

        if let Some(old_skip) = skip(&old_dep) {
            if matches!(old_skip, Skip::Directory) {
                old_deps.skip_current_dir();
            }
            old_dep = old_deps.next().transpose().map_err(AnyError::new)?;
            continue;
        }
        if let Some(new_skip) = skip(&new_dep) {
            if matches!(new_skip, Skip::Directory) {
                new_deps.skip_current_dir();
            }
            new_dep = new_deps.next().transpose().map_err(AnyError::new)?;
            continue;
        }

        // check that both have the same number of files
        let (some_old_dep, some_new_dep) = match (old_dep, new_dep) {
            (Some(old_dep), Some(new_dep)) => (old_dep, new_dep),
            (None, None) => break,
            (Some(extra), None) => {
                return Err(AnyError::msg(format!(
                    "{} is extraneous in deps",
                    extra.path().display()
                ))
                .into())
            },
            (None, Some(missing)) => {
                return Err(AnyError::msg(format!(
                    "{} is missing from deps",
                    missing.path().display()
                ))
                .into())
            },
        };

        // strip the file path prefixes to make them comparable
        let old_dep_path = some_old_dep
            .path()
            .strip_prefix(&old_wit)
            .map_err(AnyError::new)?;
        let new_dep_path = some_new_dep
            .path()
            .strip_prefix(&new_wit)
            .map_err(AnyError::new)?;

        // check that the next file path and type match
        if old_dep_path != new_dep_path {
            return Err(AnyError::msg(format!(
                "file name mismatch between {} and {} in deps",
                old_dep_path.display(),
                new_dep_path.display(),
            ))
            .into());
        }
        if some_old_dep.file_type() != some_new_dep.file_type() {
            return Err(AnyError::msg(format!(
                "file type mismatch for {}",
                old_dep_path.display()
            ))
            .into());
        }

        // we can only compare the binary contents of files
        if !some_old_dep.file_type().is_file() {
            old_dep = old_deps.next().transpose().map_err(AnyError::new)?;
            new_dep = new_deps.next().transpose().map_err(AnyError::new)?;

            continue;
        }

        let mut old_file = BufReader::new(File::open(some_old_dep.path()).map_err(AnyError::new)?);
        let mut new_file = BufReader::new(File::open(some_new_dep.path()).map_err(AnyError::new)?);

        let rng = RandomState::new();
        let mut old_hasher = HashWriter {
            hasher: rng.build_hasher(),
        };
        let mut new_hasher = HashWriter {
            hasher: rng.build_hasher(),
        };

        // hash the file contents
        io::copy(&mut old_file, &mut old_hasher).map_err(AnyError::new)?;
        io::copy(&mut new_file, &mut new_hasher).map_err(AnyError::new)?;

        let (old_hash, new_hash) = (old_hasher.hasher.finish(), new_hasher.hasher.finish());

        // check that the file content hashes match
        if old_hash != new_hash {
            return Err(AnyError::msg(format!(
                "file hash mismatch for {}",
                old_dep_path.display()
            ))
            .into());
        }

        old_dep = old_deps.next().transpose().map_err(AnyError::new)?;
        new_dep = new_deps.next().transpose().map_err(AnyError::new)?;
    }

    deps.close().map_err(AnyError::new)?;

    Ok(())
}

#[derive(Debug, thiserror::Error)]
#[error(transparent)]
pub struct AnyError(#[from] anyhow::Error);

impl AnyError {
    pub fn new<E: 'static + std::error::Error + Send + Sync>(error: E) -> Self {
        Self(anyhow::Error::new(error))
    }

    pub fn msg<M: 'static + fmt::Display + fmt::Debug + Send + Sync>(message: M) -> Self {
        Self(anyhow::Error::msg(message))
    }
}

enum Skip {
    File,
    Directory,
}

struct HashWriter {
    hasher: DefaultHasher,
}

impl io::Write for HashWriter {
    fn write(&mut self, bytes: &[u8]) -> Result<usize, io::Error> {
        self.hasher.write(bytes);
        Ok(bytes.len())
    }

    fn flush(&mut self) -> Result<(), io::Error> {
        Ok(())
    }
}
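
One detail of the snippet above that is easy to miss: both hashers are built from the same `RandomState`, so their outputs are comparable. Two hashers from independent `RandomState` values are seeded differently and would generally disagree even on identical input. The std-only sketch below isolates that step; `same_contents` is a hypothetical helper, not part of the code above. Since the hash is only 64 bits, equal hashes are strong evidence of equal contents rather than proof.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, Hasher};
use std::io::{self, Read, Write};

/// Adapter that feeds every written byte into a hasher.
struct HashWriter(DefaultHasher);

impl Write for HashWriter {
    fn write(&mut self, bytes: &[u8]) -> io::Result<usize> {
        self.0.write(bytes);
        Ok(bytes.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Streams both readers through hashers that share one random seed
/// and reports whether the resulting hashes are equal.
fn same_contents(mut a: impl Read, mut b: impl Read) -> io::Result<bool> {
    let state = RandomState::new();
    let mut hash_a = HashWriter(state.build_hasher());
    let mut hash_b = HashWriter(state.build_hasher());
    io::copy(&mut a, &mut hash_a)?;
    io::copy(&mut b, &mut hash_b)?;
    Ok(hash_a.0.finish() == hash_b.0.finish())
}
```

Any `Read` implementation works here, for example two `BufReader<File>` values as in the snippet above, or byte slices in a test.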
