Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore hidden files during serialization #356

Merged
merged 16 commits into from
Jun 15, 2023
Merged

Conversation

aspctu
Copy link
Collaborator

@aspctu aspctu commented Jun 9, 2023

This PR adds support to ignore hidden files (amongst other files defined in the Python .gitignore) during the gather() step. Previously, these hidden files would alter behavior of patching and control server and would introduce extra files during the build step. This PR also checks to see if it needs to remove/copy the data directory during the gather process. This op can be quite costly (data directories may contain large binaries).

One important thing to note here is that we now always gather the user truss. We do this because as part of this PR, we do not want to alter the users original Truss repo. For example, a user's truss may also be a git repo and because we remove .git files as part of the .truss_ignore, we'd be altering the user's setup. Instead, now we always create a replica truss via gather and then filter that truss accordingly before serializing.

In total, this PR adds:

  1. Support for .truss_ignore to remove unnecessary files
  2. Checks to remove/copy the data directory during the gather process

truss/util/python.gitignore Outdated Show resolved Hide resolved
truss/util/path.py Outdated Show resolved Hide resolved
truss/util/path.py Outdated Show resolved Hide resolved
truss/truss_handle.py Outdated Show resolved Hide resolved
@aspctu aspctu requested a review from bolasim June 13, 2023 00:25
Copy link
Collaborator

@bolasim bolasim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better. Willl also wait for @pankajroark to LGTM.

I thikn it'd be worthwhile to also add that test I describe in the comment.

truss/tests/test_truss_util.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@joostinyi joostinyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the changes that you made actually get applied without removing the is_scattered check from the decorator proxy_to_shadow_if_scattered? I don't see where we are calling TrussHandle.gather directly

truss/util/path.py Outdated Show resolved Hide resolved
truss/truss_handle.py Outdated Show resolved Hide resolved
@aspctu aspctu requested review from bolasim and joostinyi June 13, 2023 18:17
Copy link
Collaborator

@bolasim bolasim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with a few changes requested in the test for ergo

truss/tests/conftest.py Show resolved Hide resolved
truss/tests/test_truss_util.py Outdated Show resolved Hide resolved
truss/util/path.py Outdated Show resolved Hide resolved
truss/util/path.py Outdated Show resolved Hide resolved
truss/util/path.py Outdated Show resolved Hide resolved
truss/util/path.py Outdated Show resolved Hide resolved
assert ".git/" in patterns


def test_are_dirs_equal(custom_model_truss_dir):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that this is better in general, but for consistency: would be good to mirror target directory under test because that's the convention. So tests/util/test_path.py

truss/util/path.py Outdated Show resolved Hide resolved
truss/util/path.py Outdated Show resolved Hide resolved
truss/util/path.py Outdated Show resolved Hide resolved
truss/util/path.py Show resolved Hide resolved
return True
elif fnmatch.fnmatch(path.name, pattern):
return True
path = path.parent if path.parent != path else None # type: ignore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this part. Would this keep going tup until it reaches `/'? Is that what we want?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, and this isn't what we want. I added support to pass in a base_dir to this function so that the path becomes relative to the user truss. I've also added a test to check this case, thanks for catching.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be good to make the doc comment more descriptive about the protocol and what each parameter means.

return True
elif fnmatch.fnmatch(path.name, pattern):
return True
path = path.parent if path.parent != path else None # type: ignore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be good to make the doc comment more descriptive about the protocol and what each parameter means.

truss/util/path.py Outdated Show resolved Hide resolved
@aspctu aspctu merged commit 7079564 into main Jun 15, 2023
5 checks passed
@aspctu aspctu deleted the abuqader/ignore-hidden-files branch June 15, 2023 20:05
aspctu added a commit that referenced this pull request Jun 16, 2023
aspctu added a commit that referenced this pull request Jun 16, 2023
@aspctu aspctu restored the abuqader/ignore-hidden-files branch June 16, 2023 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants