Merged
17 changes: 17 additions & 0 deletions .codegen.json
@@ -0,0 +1,17 @@
{
  "version": {
    "src/databricks/labs/lsql/__about__.py": "__version__ = \"$VERSION\""
  },
  "toolchain": {
    "required": ["python3"],
    "pre_setup": [
      "python3 -m pip install hatch==1.7.0",
      "python3 -m hatch env create"
    ],
    "prepend_path": ".venv/bin",
    "acceptance_path": "tests/integration",
    "test": [
      "pytest -n 4 --cov src --cov-report=xml --timeout 30 tests/unit --durations 20"
    ]
  }
}
45 changes: 45 additions & 0 deletions .github/workflows/acceptance.yml
@@ -0,0 +1,45 @@
name: acceptance

on:
  pull_request:
    types: [ opened, synchronize, ready_for_review ]

permissions:
  id-token: write
  contents: read
  pull-requests: write

concurrency:
  group: single-acceptance-job-per-repo

jobs:
  integration:
    if: github.event_name == 'pull_request' && github.event.pull_request.draft == false
    environment: runtime
    runs-on: larger
    steps:
      - name: Checkout Code
        uses: actions/checkout@v2.5.0

      - name: Unshallow
        run: git fetch --prune --unshallow

      - name: Install Python
        uses: actions/setup-python@v4
        with:
          cache: 'pip'
          cache-dependency-path: '**/pyproject.toml'
          python-version: '3.10'

      - name: Install hatch
        run: pip install hatch==1.7.0

      - name: Run integration tests
        uses: databrickslabs/sandbox/acceptance@acceptance/v0.1.4
        with:
          vault_uri: ${{ secrets.VAULT_URI }}
          timeout: 45m
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          ARM_CLIENT_ID: ${{ secrets.ARM_CLIENT_ID }}
          ARM_TENANT_ID: ${{ secrets.ARM_TENANT_ID }}
28 changes: 9 additions & 19 deletions .github/workflows/push.yml
@@ -14,14 +14,12 @@ on:
branches:
- main

env:
HATCH_VERSION: 1.7.0

jobs:
ci:
strategy:
fail-fast: false
matrix:
pyVersion: [ '3.8', '3.9', '3.10', '3.11', '3.12' ]
pyVersion: [ '3.10', '3.11', '3.12' ]
runs-on: ubuntu-latest
steps:
- name: Checkout
@@ -34,11 +32,10 @@ jobs:
cache-dependency-path: '**/pyproject.toml'
python-version: ${{ matrix.pyVersion }}

- name: Install hatch
run: pip install hatch==$HATCH_VERSION

- name: Run unit tests
run: hatch run unit:test
run: |
pip install hatch==1.7.0
make test

- name: Publish test coverage
uses: codecov/codecov-action@v1
@@ -49,15 +46,8 @@ jobs:
- name: Checkout
uses: actions/checkout@v3

- name: Install Python
uses: actions/setup-python@v4
with:
cache: 'pip'
cache-dependency-path: '**/pyproject.toml'
python-version: 3.10.x

- name: Install hatch
run: pip install hatch==$HATCH_VERSION
- name: Format all files
run: make dev fmt

- name: Verify linting
run: hatch run lint:verify
- name: Fail on differences
run: git diff --exit-code
48 changes: 48 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,48 @@
name: Release

on:
  push:
    tags:
      - 'v*'

jobs:
  publish:
    runs-on: ubuntu-latest
    environment: release
    permissions:
      # Used to authenticate to PyPI via OIDC and sign the release's artifacts with sigstore-python.
      id-token: write
      # Used to attach signing artifacts to the published release.
      contents: write
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          cache: 'pip'
          cache-dependency-path: '**/pyproject.toml'
          python-version: '3.10'

      - name: Build wheels
        run: |
          pip install hatch==1.7.0
          hatch build

      - name: Draft release
        uses: softprops/action-gh-release@v1
        with:
          draft: true
          files: |
            dist/databricks_*.whl
            dist/databricks_*.tar.gz

      - uses: pypa/gh-action-pypi-publish@release/v1
        name: Publish package distributions to PyPI

      - name: Sign artifacts with Sigstore
        uses: sigstore/gh-action-sigstore-python@v2.1.1
        with:
          inputs: |
            dist/databricks_*.whl
            dist/databricks_*.tar.gz
          release-signing-artifacts: true
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,5 @@
# Version changelog

## 0.0.0

Initial commit
118 changes: 116 additions & 2 deletions CONTRIBUTING.md
@@ -1,3 +1,117 @@
To setup local dev environment, you have to install `hatch` tooling: `pip install hatch`.
# Contributing

After, you have to configure your IDE with it: `hatch run python -c "import sys; print(sys.executable)" | pbcopy`
## First Principles

Favoring standard libraries over external dependencies, especially in specific contexts like Databricks, is a best practice in software
development.

There are several reasons why this approach is encouraged:
- Standard libraries are typically well-vetted, thoroughly tested, and maintained by the official maintainers of the programming language or platform. This ensures a higher level of stability and reliability.
- External dependencies, especially lesser-known or unmaintained ones, can introduce bugs, security vulnerabilities, or compatibility issues that can be challenging to resolve. Adding external dependencies increases the complexity of your codebase.
- Each dependency may have its own set of dependencies, potentially leading to a complex web of dependencies that can be difficult to manage. This complexity can lead to maintenance challenges, increased risk, and longer build times.
- External dependencies can pose security risks. If a library or package has known security vulnerabilities and is widely used, it becomes an attractive target for attackers. Minimizing external dependencies reduces the potential attack surface and makes it easier to keep your code secure.
- Relying on standard libraries enhances code portability. It ensures your code can run on different platforms and environments without being tightly coupled to specific external dependencies. This is particularly important in settings like Databricks, where you may need to run your code on different clusters or setups.
- External dependencies may have their versioning schemes and compatibility issues. When using standard libraries, you have more control over versioning and can avoid conflicts between different dependencies in your project.
- Fewer external dependencies mean faster build and deployment times. Downloading, installing, and managing external packages can slow down these processes, especially in large-scale projects or distributed computing environments like Databricks.
- External dependencies can be abandoned or go unmaintained over time. This can lead to situations where your project relies on outdated or unsupported code. When you depend on standard libraries, you have confidence that the core functionality you rely on will continue to be maintained and improved.

While minimizing external dependencies is essential, exceptions can be made case-by-case. There are situations where external dependencies are
justified, such as when a well-established and actively maintained library provides significant benefits, like time savings, performance improvements,
or specialized functionality unavailable in standard libraries.

## Common fixes for `mypy` errors

See https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html for more details

### ..., expression has type "None", variable has type "str"

* Add `assert ... is not None` if it's in the body of a method. Example:

```
# error: Argument 1 to "delete" of "DashboardWidgetsAPI" has incompatible type "str | None"; expected "str"
self._ws.dashboard_widgets.delete(widget.id)
```

after

```
assert widget.id is not None
self._ws.dashboard_widgets.delete(widget.id)
```

* Add `... | None` if it's in the dataclass. Example: `cloud: str = None` -> `cloud: str | None = None`

### ..., has incompatible type "Path"; expected "str"

Call `.as_posix()` on the `Path` to convert it to a `str`.
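
A minimal sketch of this fix, assuming a hypothetical `describe` function that only accepts `str` (neither the function nor the file names come from this repository):

```python
from pathlib import Path


def describe(path: str) -> str:
    """Hypothetical function whose signature only accepts `str`."""
    return f"uploading {path}"


notebook = Path("dashboards") / "queries" / "counter.sql"
# describe(notebook) would fail type checking: a Path is not a str.
message = describe(notebook.as_posix())
```

`as_posix()` always uses forward slashes, so the resulting string is the same on every operating system, whereas `str(path)` uses backslashes on Windows.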

### Argument 2 to "get" of "dict" has incompatible type "None"; expected ...

Pass a default value of the expected return type as the second argument to `dict.get`.

Example:
```python
def viz_type(self) -> str:
    return self.viz.get("type", None)
```

after:

```python
def viz_type(self) -> str:
    return self.viz.get("type", "UNKNOWN")
```

## Local Setup

This section is a step-by-step guide to setting up the project environment and its dependencies so you can start working on the project.

To begin, run `make dev` to create the default environment and install the development dependencies, assuming you've already cloned the GitHub repo.

```shell
make dev
```

Verify the installation with:
```shell
make test
```

Before every commit, format the code so that the codebase stays consistent:
```shell
make fmt
```

Before every commit, run the automated bug detector (`make lint`) and the unit tests (`make test`) so that the automated
pull request checks pass before others review your code:
```shell
make lint
make test
```

## First contribution

Here are the example steps to submit your first contribution:

1. Make a fork of this repo (if you really want to contribute)
2. `git clone`
3. `git checkout main` (or `gcm` if you're using [ohmyzsh](https://ohmyz.sh/)).
4. `git pull` (or `gl` if you're using [ohmyzsh](https://ohmyz.sh/)).
5. `git checkout -b FEATURENAME` (or `gcb FEATURENAME` if you're using [ohmyzsh](https://ohmyz.sh/)).
6. .. do the work
7. `make fmt`
8. `make lint`
9. .. fix if any
10. `make test`
11. .. fix if any
12. `git commit -a`. Make sure to enter a meaningful commit message title.
13. `git push origin FEATURENAME`
14. Go to GitHub UI and create PR. Alternatively, `gh pr create` (if you have [GitHub CLI](https://cli.github.com/) installed).
Use a meaningful pull request title because it'll appear in the release notes. Use `Resolves #NUMBER` in pull
request description to [automatically link it](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue)
to an existing issue.
15. Announce the PR for review.

## Troubleshooting

If you encounter any package dependency errors after `git pull`, run `make clean`.
82 changes: 63 additions & 19 deletions LICENSE
@@ -1,25 +1,69 @@
DB license
Databricks License
Copyright (2024) Databricks, Inc.

Copyright (2023) Databricks, Inc.
Definitions.

Agreement: The agreement between Databricks, Inc., and you governing
the use of the Databricks Services, as that term is defined in
the Master Cloud Services Agreement (MCSA) located at
www.databricks.com/legal/mcsa.

Licensed Materials: The source code, object code, data, and/or other
works to which this license applies.

Definitions.
Scope of Use. You may not use the Licensed Materials except in
connection with your use of the Databricks Services pursuant to
the Agreement. Your use of the Licensed Materials must comply at all
times with any restrictions applicable to the Databricks Services,
generally, and must be used in accordance with any applicable
documentation. You may view, use, copy, modify, publish, and/or
distribute the Licensed Materials solely for the purposes of using
the Licensed Materials within or connecting to the Databricks Services.
If you do not agree to these terms, you may not view, use, copy,
modify, publish, and/or distribute the Licensed Materials.

Redistribution. You may redistribute and sublicense the Licensed
Materials so long as all use is in compliance with these terms.
In addition:

- You must give any other recipients a copy of this License;
- You must cause any modified files to carry prominent notices
stating that you changed the files;
- You must retain, in any derivative works that you distribute,
all copyright, patent, trademark, and attribution notices,
excluding those notices that do not pertain to any part of
the derivative works; and
- If a "NOTICE" text file is provided as part of its
distribution, then any derivative works that you distribute
must include a readable copy of the attribution notices
contained within such NOTICE file, excluding those notices
that do not pertain to any part of the derivative works.

Agreement: The agreement between Databricks, Inc., and you governing the use of the Databricks Services, which shall be, with respect to Databricks, the Databricks Terms of Service located at www.databricks.com/termsofservice, and with respect to Databricks Community Edition, the Community Edition Terms of Service located at www.databricks.com/ce-termsofuse, in each case unless you have entered into a separate written agreement with Databricks governing the use of the applicable Databricks Services.
You may add your own copyright statement to your modifications and may
provide additional license terms and conditions for use, reproduction,
or distribution of your modifications, or for any such derivative works
as a whole, provided your use, reproduction, and distribution of
the Licensed Materials otherwise complies with the conditions stated
in this License.

Software: The source code and object code to which this license applies.
Termination. This license terminates automatically upon your breach of
these terms or upon the termination of your Agreement. Additionally,
Databricks may terminate this license at any time on notice. Upon
termination, you must permanently delete the Licensed Materials and
all copies thereof.

Scope of Use. You may not use this Software except in connection with your use of the Databricks Services pursuant to the Agreement. Your use of the Software must comply at all times with any restrictions applicable to the Databricks Services, generally, and must be used in accordance with any applicable documentation. You may view, use, copy, modify, publish, and/or distribute the Software solely for the purposes of using the code within or connecting to the Databricks Services. If you do not agree to these terms, you may not view, use, copy, modify, publish, and/or distribute the Software.
DISCLAIMER; LIMITATION OF LIABILITY.

Redistribution. You may redistribute and sublicense the Software so long as all use is in compliance with these terms. In addition:

You must give any other recipients a copy of this License;
You must cause any modified files to carry prominent notices stating that you changed the files;
You must retain, in the source code form of any derivative works that you distribute, all copyright, patent, trademark, and attribution notices from the source code form, excluding those notices that do not pertain to any part of the derivative works; and
If the source code form includes a "NOTICE" text file as part of its distribution, then any derivative works that you distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the derivative works.
You may add your own copyright statement to your modifications and may provide additional license terms and conditions for use, reproduction, or distribution of your modifications, or for any such derivative works as a whole, provided your use, reproduction, and distribution of the Software otherwise complies with the conditions stated in this License.

Termination. This license terminates automatically upon your breach of these terms or upon the termination of your Agreement. Additionally, Databricks may terminate this license at any time on notice. Upon termination, you must permanently delete the Software and all copies thereof.

DISCLAIMER; LIMITATION OF LIABILITY.

THE SOFTWARE IS PROVIDED “AS-IS” AND WITH ALL FAULTS. DATABRICKS, ON BEHALF OF ITSELF AND ITS LICENSORS, SPECIFICALLY DISCLAIMS ALL WARRANTIES RELATING TO THE SOURCE CODE, EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, IMPLIED WARRANTIES, CONDITIONS AND OTHER TERMS OF MERCHANTABILITY, SATISFACTORY QUALITY OR FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT. DATABRICKS AND ITS LICENSORS TOTAL AGGREGATE LIABILITY RELATING TO OR ARISING OUT OF YOUR USE OF OR DATABRICKS’ PROVISIONING OF THE SOURCE CODE SHALL BE LIMITED TO ONE THOUSAND ($1,000) DOLLARS. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
THE LICENSED MATERIALS ARE PROVIDED “AS-IS” AND WITH ALL FAULTS.
DATABRICKS, ON BEHALF OF ITSELF AND ITS LICENSORS, SPECIFICALLY
DISCLAIMS ALL WARRANTIES RELATING TO THE LICENSED MATERIALS, EXPRESS
AND IMPLIED, INCLUDING, WITHOUT LIMITATION, IMPLIED WARRANTIES,
CONDITIONS AND OTHER TERMS OF MERCHANTABILITY, SATISFACTORY QUALITY OR
FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT. DATABRICKS AND
ITS LICENSORS TOTAL AGGREGATE LIABILITY RELATING TO OR ARISING OUT OF
YOUR USE OF OR DATABRICKS’ PROVISIONING OF THE LICENSED MATERIALS SHALL
BE LIMITED TO ONE THOUSAND ($1,000) DOLLARS. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE LICENSED MATERIALS OR
THE USE OR OTHER DEALINGS IN THE LICENSED MATERIALS.