Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow artifacts (JARs, wheels) to be uploaded to UC Volumes #1591

Merged
merged 4 commits into from
Jul 16, 2024

Conversation

andrewnester
Copy link
Contributor

Changes

This change allows to specify UC volumes path as an artifact paths so all artifacts (JARs, wheels) are uploaded to UC Volumes.

Example configuration is here:

bundle:
  name: jar-bundle

workspace:
  host: https://foo.com
  artifact_path: /Volumes/main/default/foobar

artifacts:
  my_java_code:
    path: ./sample-java
    build: "javac PrintArgs.java && jar cvfm PrintArgs.jar META-INF/MANIFEST.MF PrintArgs.class"
    files:
      - source: ./sample-java/PrintArgs.jar

resources:
  jobs:
    jar_job:
      name: "Test Spark Jar Job"
      tasks:
        - task_key: TestSparkJarTask
          new_cluster:
            num_workers: 1
            spark_version: "14.3.x-scala2.12"
            node_type_id: "i3.xlarge"
          spark_jar_task:
            main_class_name: PrintArgs
          libraries:
            - jar: ./sample-java/PrintArgs.jar

Tests

Manually + added E2E test for Java jobs

E2E test is temporarily skipped until auth related issues for UC for tests are resolved

bundle/artifacts/artifacts.go Show resolved Hide resolved
bundle/artifacts/artifacts.go Show resolved Hide resolved
// Skip building if build command is not specified or infered
if artifact.BuildCommand == "" {
// If no build command was specified or infered and there is no
// artifact output files specified, artifact is misconfigured
if len(artifact.Files) == 0 {
return diag.Errorf("misconfigured artifact: please specify 'build' or 'files' property")
}
return nil
return expandGlobReference(artifact)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has a side effect on the artifact, which is not evident from it being called as a placeholder for the return value. I recommend calling this in the original spot and accumulating diags to make it clear it does something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pietern we can't really call it in the original spot because it needs to be called (also) after build mutator because it adds additional source files which might need have been built yet

bundle/artifacts/upload.go Outdated Show resolved Hide resolved
bundle/artifacts/upload.go Outdated Show resolved Hide resolved
bundle/artifacts/upload.go Outdated Show resolved Hide resolved
}

// We intentionally ignore the error because it is not critical to the deployment
err = client.Delete(ctx, ".", filer.DeleteRecursively)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is no different from what we used to do. But still, it makes me uneasy (not to mention on volumes it will take forever if the underlying directory happens to be large). Let's figure out a way to remove this call.

@andrewnester andrewnester added this pull request to the merge queue Jul 16, 2024
Merged via the queue into main with commit 434bcbb Jul 16, 2024
5 checks passed
@andrewnester andrewnester deleted the feature/uc-volumes-artifacts branch July 16, 2024 09:05
andrewnester added a commit that referenced this pull request Jul 18, 2024
CLI:
 * [Fix] Do not buffer files in memory when downloading ([#1599](#1599)).

Bundles:
 * Allow artifacts (JARs, wheels) to be uploaded to UC Volumes ([#1591](#1591)).
 * Upgrade TF provider to 1.48.3 ([#1600](#1600)).
 * Fixed job name normalisation for bundle generate ([#1601](#1601)).

Internal:
 * Add UUID to uniquely identify a deployment state ([#1595](#1595)).
 * Track multiple locations associated with a `dyn.Value` ([#1510](#1510)).
 * Attribute Terraform API requests the CLI ([#1598](#1598)).
 * Use local Terraform state only when lineage match ([#1588](#1588)).
 * Implement readahead cache for Workspace API calls ([#1582](#1582)).

Dependency updates:
 * Bump github.com/databricks/databricks-sdk-go from 0.43.0 to 0.43.2 ([#1594](#1594)).
@andrewnester andrewnester mentioned this pull request Jul 18, 2024
andrewnester added a commit that referenced this pull request Jul 18, 2024
CLI:
 * Do not buffer files in memory when downloading ([#1599](#1599)).

Bundles:
 * Allow artifacts (JARs, wheels) to be uploaded to UC Volumes ([#1591](#1591)).
 * Upgrade TF provider to 1.48.3 ([#1600](#1600)).
 * Fixed job name normalisation for bundle generate ([#1601](#1601)).

Internal:
 * Add UUID to uniquely identify a deployment state ([#1595](#1595)).
 * Track multiple locations associated with a `dyn.Value` ([#1510](#1510)).
 * Attribute Terraform API requests the CLI ([#1598](#1598)).
 * Implement readahead cache for Workspace API calls ([#1582](#1582)).
 * Use local Terraform state only when lineage match ([#1588](#1588)).

Dependency updates:
 * Bump github.com/databricks/databricks-sdk-go from 0.43.0 to 0.43.2 ([#1594](#1594)).
@andrewnester andrewnester mentioned this pull request Jul 18, 2024
github-merge-queue bot pushed a commit that referenced this pull request Jul 18, 2024
CLI:
* Do not buffer files in memory when downloading
([#1599](#1599)).

Bundles:
* Allow artifacts (JARs, wheels) to be uploaded to UC Volumes
([#1591](#1591)).
* Upgrade TF provider to 1.48.3
([#1600](#1600)).
* Fixed job name normalisation for bundle generate
([#1601](#1601)).

Internal:
* Add UUID to uniquely identify a deployment state
([#1595](#1595)).
* Track multiple locations associated with a `dyn.Value`
([#1510](#1510)).
* Attribute Terraform API requests the CLI
([#1598](#1598)).
* Implement readahead cache for Workspace API calls
([#1582](#1582)).
* Use local Terraform state only when lineage match
([#1588](#1588)).
* Add read-only mode for extension aware workspace filer
([#1609](#1609)).


Dependency updates:
* Bump github.com/databricks/databricks-sdk-go from 0.43.0 to 0.43.2
([#1594](#1594)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants