Skip to content

support .tar, .gz, and .zip chart imports #749

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ritesh-karankal
Copy link

Description

This PR fixes #748

Notes for Reviewers

  • ParseFileAsHelmChart() previously attempted to load .gz, .zip, .tar, and similar archive types using loader.LoadArchive() on raw bytes. However, this approach failed for several reasons:

    • These archives often extract into a folder with the chart inside a nested directory (e.g., my-chart/Chart.yaml), which LoadArchive() cannot handle.

    • LoadArchive() expects a .tgz archive with a flat, Helm-native layout — not arbitrary .zip or .tar structures.

    • It does not support directories (ExtractedContentPath) or already-extracted content.

    • .zip files are not tarballs and are not natively supported by Helm’s loader.

This has been fixed by replacing loader.LoadArchive() with loader.Load(), which works with directory paths. It now uses ExtractedContentPath along with a recursive FindChartDir() function to locate and load the Chart.yaml from nested folders after extraction.

  • Introduced a new identifyExtension() function that prioritizes longer extensions first (e.g., .tar.gz, .tar.tgz), ensuring format recognition.
  • Added cleanup of macOS hidden files and directory name sanitization

Please review the changes and share any feedback or suggestions.

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Ritesh Karankal <ritesh.karankal200@gmail.com>
Copy link

welcome bot commented Jun 25, 2025

Yay, your first pull request! 👍 A contributor will be by to give feedback soon. In the meantime, you can find updates in the #github-notifications channel in the community Slack.
Be sure to double-check that you have signed your commits. Here are instructions for making signing an implicit activity while performing a commit.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @ritesh-karankal, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the system's ability to import Helm charts by expanding support to include common archive formats like .tar, .gz, and .zip. The changes address limitations with nested chart directories and introduce robust file and directory sanitization for extracted content, improving overall compatibility and reliability of chart imports.

Highlights

  • Helm Chart Import Enhancement: I've updated the ParseFileAsHelmChart function to correctly handle Helm charts packaged in .tar, .gz, and .zip archives. Previously, loader.LoadArchive() was used, which struggled with non-.tgz formats and nested chart directories. Now, loader.Load() is used, which works with extracted directory paths.
  • Recursive Chart Directory Discovery: I've introduced a new FindChartDir function that recursively searches within an extracted archive's content to locate the actual directory containing Chart.yaml. This ensures that charts nested within subdirectories (e.g., my-chart/Chart.yaml inside an archive) can be successfully identified and loaded.
  • Improved File Extension Identification: A new identifyExtension helper function has been added to accurately determine file extensions, prioritizing longer, more specific extensions like .tar.gz or .tar.tgz over shorter ones like .gz. This improves the robustness of file type recognition.
  • Extracted Content Sanitization: To improve compatibility and prevent issues, I've added two new functions: SanitizeDirNames to replace spaces in directory names with hyphens, and removeHiddenFiles to delete macOS-specific hidden files (like .DS_Store and ._*) from extracted archive contents. These sanitization steps are applied after archive extraction.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR does a great job of adding support for various archive formats (.tar, .gz, .zip) for Helm chart imports. The core logic change from loader.LoadArchive to loader.Load, combined with a recursive search for Chart.yaml, is a solid and correct approach to handle various chart archive structures.

The addition of identifyExtension for better file type detection, sanitization of directory names, and removal of hidden macOS files makes the import process more robust and reliable. The code is well-structured, and the error handling has been improved significantly.

I have a couple of suggestions to further enhance code clarity and performance. Overall, this is an excellent contribution.

return nil
})

// Move the error check here so errStopWalk is in scope
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This comment is slightly misleading. errStopWalk is in scope throughout the function. The purpose of this check is to distinguish the sentinel error, which indicates a successful early exit from the walk, from actual filesystem errors that might occur.

Improving the comment will enhance clarity for future maintenance.

Suggested change
// Move the error check here so errStopWalk is in scope
// The walk is expected to stop with errStopWalk if a chart is found; any other error is unexpected.

normalizedName := strings.ToLower(strings.TrimSpace(name))

// Prioritize longer, more specific extensions first to correctly identify ".tar.gz" over ".gz"
knownExtensions := []string{".tar.gz", ".tar.tgz", ".tgz", ".tar", ".zip", ".gz", ".yml", ".yaml", ".json"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The knownExtensions slice is re-initialized on every call to identifyExtension. For better performance and to follow Go best practices, this should be defined once at the package level as a variable.

This avoids repeated memory allocations and makes the list of extensions easier to manage.

For example:

// at package level
var knownExtensions = []string{".tar.gz", ".tar.tgz", ".tgz", ".tar", ".zip", ".gz", ".yml", ".yaml", ".json"}

func identifyExtension(name string) string {
    // ... use knownExtensions here without re-defining it
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failed to Import Helm Charts from .tar, .gz, and .zip Archives
1 participant