Add dlt commands as Jupyter Notebook magic commands #793

Open
wants to merge 91 commits into base: devel

Commits
401b366
added first version
Vasilije1990 Nov 23, 2023
8f4f11f
Added initial version of logic for the notebook magics that work when…
Vasilije1990 Nov 25, 2023
c61e0f1
Added initial version of logic for the notebook magics that work when…
Vasilije1990 Nov 26, 2023
3f96e37
Added initial version of logic for the notebook magics that work when…
Vasilije1990 Nov 26, 2023
08129be
Added initial version of logic for the notebook magics that work when…
Vasilije1990 Nov 26, 2023
1807ffb
Added initial version of logic for the notebook magics that work when…
Vasilije1990 Nov 26, 2023
0f390ff
Added fixes and updates, some of the magics now run
Vasilije1990 Nov 27, 2023
395b12f
Added init command to the magics, tested and added symlink for the _d…
Vasilije1990 Nov 27, 2023
508d8d7
Improve the is_notebook function to differentiate between pure ipytho…
Vasilije1990 Nov 28, 2023
2e937dc
Add is_ipython to be able to register magics for ipython console type
Vasilije1990 Nov 28, 2023
e026c31
Added notebook confirm function
Vasilije1990 Nov 28, 2023
e7118e6
Added changes to the init command
Vasilije1990 Nov 28, 2023
e840e81
Added changes to the init command
Vasilije1990 Nov 28, 2023
25d07e1
Added initial docs + magics changes
Vasilije1990 Nov 29, 2023
e7a65d2
Added initial docs + magics changes
Vasilije1990 Nov 29, 2023
90f181a
Cleanup
Vasilije1990 Nov 29, 2023
6db42dd
Cleanup
Vasilije1990 Nov 29, 2023
0690258
Cleanup
Vasilije1990 Nov 29, 2023
aff8893
Fixes to pass the tests
Vasilije1990 Nov 29, 2023
65ad795
Fixes to pass the tests
Vasilije1990 Nov 29, 2023
dd14832
Fixes to pass the tests
Vasilije1990 Nov 29, 2023
c3edfaa
Fixes to pass the tests
Vasilije1990 Nov 29, 2023
de719e9
Fixed types for exception
Vasilije1990 Nov 29, 2023
ad757d3
Fixed types for exception
Vasilije1990 Nov 29, 2023
2cd2c19
Fixed types for exception
Vasilije1990 Nov 29, 2023
0678129
Renamed functions that register magics, added docstrings, moved symli…
Vasilije1990 Nov 30, 2023
1e35c78
Comment out deploy
Vasilije1990 Nov 30, 2023
7e75e3c
Fix using DOT_DLT
Vasilije1990 Dec 2, 2023
b27150c
Move deps to notebook extras in poetry
Vasilije1990 Dec 2, 2023
3e03aa6
fix import to standardise
Vasilije1990 Dec 2, 2023
2be6518
Added tests for symlink
Vasilije1990 Dec 2, 2023
24b854e
Clean unused imports
Vasilije1990 Dec 2, 2023
afee85a
Clean the ifs, update the echo command
Vasilije1990 Dec 2, 2023
441e6c4
Clean the ifs, update the echo command
Vasilije1990 Dec 3, 2023
3f63974
Update the docs
Vasilije1990 Dec 4, 2023
1590603
Update the docs
Vasilije1990 Dec 4, 2023
947f295
Fix the missign ipython error and fix one import
Vasilije1990 Dec 4, 2023
971a20d
Fix docs, revert the dlt version
Vasilije1990 Dec 5, 2023
6d752a2
Fix docs, revert the dlt version
Vasilije1990 Dec 5, 2023
5137c4a
Fix docs, lock poetry
Vasilije1990 Dec 10, 2023
51d3514
Update lock, to pass tests
Vasilije1990 Dec 12, 2023
1503140
Fix linting
Vasilije1990 Dec 12, 2023
e922b9c
Fix linting
Vasilije1990 Dec 12, 2023
1e52ff0
Fix linting
Vasilije1990 Dec 12, 2023
76d1018
Fix linting
Vasilije1990 Dec 12, 2023
da2e077
edit doc
AstrakhantsevaAA Dec 12, 2023
eb8a09d
Add tests, fix returns, small formatting things
Vasilije1990 Dec 20, 2023
1a77b07
Create poetry notebook group
Vasilije1990 Dec 20, 2023
d5ebf87
Merge remote-tracking branch 'origin/devel' into issue-502
AstrakhantsevaAA Jan 2, 2024
1d6d6df
Next step for tests
Vasilije1990 Jan 2, 2024
152fa99
Fix notebook install
Vasilije1990 Jan 2, 2024
65778b0
Fix notebook install
Vasilije1990 Jan 2, 2024
7549e73
Fix notebook install
Vasilije1990 Jan 2, 2024
b363684
Fix notebook install
Vasilije1990 Jan 2, 2024
feb277b
Fix notebook install
Vasilije1990 Jan 2, 2024
bcb0715
Fix notebook install
Vasilije1990 Jan 2, 2024
d87e78c
rename destination_name to destination_type, delete unused keys from …
AstrakhantsevaAA Jan 3, 2024
917dc65
add tests, refactor
AstrakhantsevaAA Jan 3, 2024
df84a5e
use pytest for tests
AstrakhantsevaAA Jan 3, 2024
04333a2
refactoring
AstrakhantsevaAA Jan 3, 2024
e859675
delete unused stuff, refactoring
AstrakhantsevaAA Jan 3, 2024
1daadb3
delete unused deps
AstrakhantsevaAA Jan 3, 2024
0c6a811
add sentry to deps
AstrakhantsevaAA Jan 3, 2024
d0c8483
poetry lock file
AstrakhantsevaAA Jan 4, 2024
459e7a6
return poetry lock file
AstrakhantsevaAA Jan 4, 2024
0356072
try fix flake
AstrakhantsevaAA Jan 4, 2024
7f7c8bc
fix lint
AstrakhantsevaAA Jan 4, 2024
a02d1d3
fix imports
AstrakhantsevaAA Jan 4, 2024
22d1fef
update documentation
AstrakhantsevaAA Jan 4, 2024
899b25e
return old poetry lock
AstrakhantsevaAA Jan 4, 2024
5e77027
return ipython
AstrakhantsevaAA Jan 4, 2024
e6bbc26
update lock without jedi
AstrakhantsevaAA Jan 4, 2024
6b2fa34
update lock with jedi
AstrakhantsevaAA Jan 4, 2024
e52bdd3
Merge remote-tracking branch 'origin/devel' into issue-502
AstrakhantsevaAA Jan 5, 2024
69f449d
update lock, delete magic test workflow
AstrakhantsevaAA Jan 5, 2024
9b335a2
delete ipython
AstrakhantsevaAA Jan 5, 2024
86daab7
return notebook
AstrakhantsevaAA Jan 5, 2024
b0e95a0
fix poetry install packages
AstrakhantsevaAA Jan 5, 2024
0f65b62
update flake8-encodings
AstrakhantsevaAA Jan 5, 2024
0ba91d6
move flake8_encodings to patched version
sh-rp Jan 5, 2024
3d2a43d
black
AstrakhantsevaAA Jan 5, 2024
0ca1c16
uncomment rest tests
AstrakhantsevaAA Jan 5, 2024
e63d6e4
delete magic pipeline command tests
AstrakhantsevaAA Jan 5, 2024
a720335
delete unused dlt import
AstrakhantsevaAA Jan 5, 2024
1021e56
Merge remote-tracking branch 'origin/devel' into issue-502
AstrakhantsevaAA Jan 8, 2024
a504313
update lock
AstrakhantsevaAA Jan 8, 2024
9c01ea0
revert some changes
AstrakhantsevaAA Jan 11, 2024
0fa8fd2
ipython is optional
AstrakhantsevaAA Jan 11, 2024
c7f45ae
add notebooks to lint workflow
AstrakhantsevaAA Jan 11, 2024
d5bf33f
delete symlink
AstrakhantsevaAA Jan 11, 2024
91d2f72
Revert "delete symlink"
AstrakhantsevaAA Jan 22, 2024

10 changes: 5 additions & 5 deletions .github/workflows/lint.yml
@@ -25,7 +25,7 @@ jobs:
     defaults:
       run:
         shell: bash
-    runs-on: ${{ matrix.os }}
+    runs-on: ${{ matrix.os }}

     steps:

@@ -42,7 +42,7 @@ jobs:
         with:
           virtualenvs-create: true
           virtualenvs-in-project: true
-          installer-parallel: true
+          installer-parallel: true

       - name: Load cached venv
         id: cached-poetry-dependencies
@@ -53,11 +53,11 @@ jobs:

       - name: Install dependencies
         # if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
-        run: poetry install --all-extras --with airflow,providers,pipeline,sentry-sdk
+        run: poetry install --all-extras --with airflow,providers,pipeline,sentry-sdk,notebook

       - name: Run make lint
         run: |
-          export PATH=$PATH:"/c/Program Files/usr/bin" # needed for Windows
+          export PATH=$PATH:"/c/Program Files/usr/bin" # needed for Windows
           make lint

       # - name: print envs
@@ -75,4 +75,4 @@ jobs:
       - name: Check matrix job results
         if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled')
         run: |
-          echo "One or more matrix job tests failed or were cancelled. You may need to re-run them." && exit 1
+          echo "One or more matrix job tests failed or were cancelled. You may need to re-run them." && exit 1
2 changes: 1 addition & 1 deletion .github/workflows/test_local_destinations.yml
@@ -81,7 +81,7 @@ jobs:
           key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/poetry.lock') }}-local-destinations

       - name: Install dependencies
-        run: poetry install --no-interaction -E postgres -E duckdb -E parquet -E filesystem -E cli -E weaviate --with sentry-sdk --with pipeline
+        run: poetry install --no-interaction -E postgres -E duckdb -E parquet -E filesystem -E cli -E weaviate --with sentry-sdk,pipeline,notebook

       - name: create secrets.toml
         run: pwd && echo "$DLT_SECRETS_TOML" > tests/.dlt/secrets.toml
7 changes: 7 additions & 0 deletions dlt/__init__.py
@@ -41,6 +41,13 @@
 from dlt.pipeline import progress
 from dlt import destinations

+try:
+    from dlt.cli.magics import register_notebook_magics
+
+    register_notebook_magics()
+except Exception:
+    pass
+
 pipeline = _pipeline
 current = _current
 mark = _mark
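For readers unfamiliar with the pattern in the `dlt/__init__.py` hunk, here is a minimal sketch of optional magic registration. It is not the PR's `register_notebook_magics`; `hello_magic`, `register_magics_if_ipython`, and the magic name `hello_dlt` are hypothetical names used only for illustration. The sketch assumes IPython may or may not be installed and catches only `ImportError`, which is narrower than the bare `except Exception` in the diff.

```python
from typing import Callable


def hello_magic(line: str) -> str:
    """Hypothetical line magic: echoes its argument back."""
    return f"dlt magic called with: {line!r}"


def register_magics_if_ipython(magic: Callable[[str], str], name: str) -> bool:
    """Register `magic` as a line magic; return False when no IPython shell is active."""
    try:
        from IPython import get_ipython  # optional dependency
    except ImportError:
        return False
    shell = get_ipython()  # None when running as a plain Python script
    if shell is None:
        return False
    # Makes `%<name> args` available in the running shell or notebook.
    shell.register_magic_function(magic, magic_kind="line", magic_name=name)
    return True


registered = register_magics_if_ipython(hello_magic, "hello_dlt")
```

In a notebook or IPython console `registered` would be `True` and `%hello_dlt init` would call `hello_magic("init")`; in a plain script the import of the package stays side-effect free.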
6 changes: 5 additions & 1 deletion dlt/cli/echo.py
@@ -47,9 +47,13 @@ def note(msg: str) -> None:

 def confirm(text: str, default: Optional[bool] = None) -> bool:
     if ALWAYS_CHOOSE_VALUE:
+        warning(f"Automatically choosing {ALWAYS_CHOOSE_VALUE} for: {text}")
         return bool(ALWAYS_CHOOSE_VALUE)
     if ALWAYS_CHOOSE_DEFAULT:
-        assert default is not None
+        assert (
+            default is not None
+        ), "Default value must be provided when ALWAYS_CHOOSE_DEFAULT is enabled."
+        warning(f"Automatically choosing default ({default}) value for: {text}")
         return default
     return click.confirm(text, default=default)

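The `confirm` change above matters for notebooks, where an interactive prompt may not be answerable. The following is a rough, self-contained sketch of the same decision order, not the code in `dlt/cli/echo.py`: the flags are passed as keyword arguments instead of module globals, `_prompt` is a hypothetical stand-in for `click.confirm`, and a forced value is tested with `is not None` so that a forced `False` also works (the diff uses a truthiness check).

```python
import sys
from typing import Callable, Optional


def _prompt(text: str, default: Optional[bool]) -> bool:
    """Stand-in for an interactive prompt: empty input picks the default."""
    suffix = {True: "[Y/n]", False: "[y/N]", None: "[y/n]"}[default]
    while True:
        answer = input(f"{text} {suffix}: ").strip().lower()
        if not answer and default is not None:
            return default
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            return False


def confirm(
    text: str,
    default: Optional[bool] = None,
    *,
    always_choose_value: Optional[bool] = None,
    always_choose_default: bool = False,
    ask: Callable[[str, Optional[bool]], bool] = _prompt,
) -> bool:
    """Resolve a yes/no question: forced value first, then forced default, then ask."""
    if always_choose_value is not None:
        print(f"Automatically choosing {always_choose_value} for: {text}", file=sys.stderr)
        return always_choose_value
    if always_choose_default:
        if default is None:
            raise ValueError("A default is required when always_choose_default is enabled.")
        print(f"Automatically choosing default ({default}) for: {text}", file=sys.stderr)
        return default
    return ask(text, default)
```

Raising `ValueError` instead of using `assert` is a design choice of this sketch: the check then survives `python -O`, which strips assertions.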
84 changes: 53 additions & 31 deletions dlt/cli/init_command.py
@@ -1,41 +1,42 @@
import os
import ast
import os
import shutil
from types import ModuleType
from typing import Dict, List, Sequence, Tuple
from importlib.metadata import version as pkg_version

import dlt.reflection.names as n
from dlt.cli import echo as fmt, pipeline_files as files_ops, source_detection, utils
from dlt.cli.config_toml_writer import WritableConfigValue, write_values
from dlt.cli.exceptions import CliCommandException
from dlt.cli.pipeline_files import (
TVerifiedSourceFileEntry,
TVerifiedSourceFileIndex,
VerifiedSourceFiles,
)
from dlt.cli.requirements import SourceRequirements
from dlt.common import git
from dlt.common.configuration.paths import get_dlt_settings_dir, make_dlt_settings_path
from dlt.common.configuration.specs import known_sections
from dlt.common.configuration.paths import (
create_symlink_to_dlt,
get_dlt_settings_dir,
make_dlt_settings_path,
)
from dlt.common.configuration.providers import (
CONFIG_TOML,
SECRETS_TOML,
ConfigTomlProvider,
SecretsTomlProvider,
)
from dlt.common.pipeline import get_dlt_repos_dir
from dlt.common.source import _SOURCES
from dlt.version import DLT_PKG_NAME, __version__
from dlt.common.configuration.specs import known_sections
from dlt.common.destination import Destination
from dlt.common.pipeline import get_dlt_repos_dir
from dlt.common.reflection.utils import rewrite_python_script
from dlt.common.schema.utils import is_valid_schema_name
from dlt.common.runtime.exec_info import is_notebook
from dlt.common.schema.exceptions import InvalidSchemaName
from dlt.common.schema.utils import is_valid_schema_name
from dlt.common.source import _SOURCES
from dlt.common.storages.file_storage import FileStorage

import dlt.reflection.names as n
from dlt.reflection.script_inspector import inspect_pipeline_script, load_script_module

from dlt.cli import echo as fmt, pipeline_files as files_ops, source_detection
from dlt.cli import utils
from dlt.cli.config_toml_writer import WritableConfigValue, write_values
from dlt.cli.pipeline_files import (
VerifiedSourceFiles,
TVerifiedSourceFileEntry,
TVerifiedSourceFileIndex,
)
from dlt.cli.exceptions import CliCommandException
from dlt.cli.requirements import SourceRequirements
from dlt.version import DLT_PKG_NAME

DLT_INIT_DOCS_URL = "https://dlthub.com/docs/reference/command-line-interface#dlt-init"
DEFAULT_VERIFIED_SOURCES_REPO = "https://github.com/dlt-hub/verified-sources.git"
@@ -92,7 +93,10 @@ def _select_source_files(
prompt = (
"Should incoming changes be Skipped, Applied (local changes will be lost) or Merged (%s"
" UPDATED | %s DELETED | all local changes remain)?"
% (fmt.bold(",".join(can_update_files)), fmt.bold(",".join(can_delete_files)))
% (
fmt.bold(",".join(can_update_files)),
fmt.bold(",".join(can_delete_files)),
)
)
choices = "sam"
else:
@@ -171,7 +175,10 @@ def _welcome_message(
if is_new_source:
fmt.echo(
"* Add credentials for %s and other secrets in %s"
% (fmt.bold(destination_type), fmt.bold(make_dlt_settings_path(SECRETS_TOML)))
% (
fmt.bold(destination_type),
fmt.bold(make_dlt_settings_path(SECRETS_TOML)),
)
)

if dependency_system:
@@ -191,7 +198,7 @@ def _welcome_message(
)
elif dependency_system == utils.PYPROJECT_TOML:
fmt.echo(" If you are using poetry you may issue the following command:")
fmt.echo(fmt.bold(" poetry add %s -E %s" % (DLT_PKG_NAME, destination_type)))
fmt.echo(fmt.bold(f" poetry add {DLT_PKG_NAME} -E {destination_type}"))
fmt.echo()
else:
fmt.echo(
@@ -216,7 +223,7 @@ def list_verified_sources_command(repo_location: str, branch: str = None) -> Non
for source_name, source_files in _list_verified_sources(repo_location, branch).items():
reqs = source_files.requirements
dlt_req_string = str(reqs.dlt_requirement_base)
msg = "%s: %s" % (fmt.bold(source_name), source_files.doc)
msg = f"{fmt.bold(source_name)}: {source_files.doc}"
if not reqs.is_installed_dlt_compatible():
msg += fmt.warning_style(" [needs update: %s]" % (dlt_req_string))
fmt.echo(msg)
@@ -279,7 +286,11 @@ def init_command(
if conflict_modified or conflict_deleted:
# select source files that can be copied/updated
_, remote_modified, remote_deleted = _select_source_files(
source_name, remote_modified, remote_deleted, conflict_modified, conflict_deleted
source_name,
remote_modified,
remote_deleted,
conflict_modified,
conflict_deleted,
)
if not remote_deleted and not remote_modified:
fmt.echo("No files to update, exiting")
@@ -327,7 +338,8 @@ def init_command(
)
fmt.warning(msg)
if not fmt.confirm(
"Would you like to continue anyway? (you can update dlt after this step)", default=True
"Would you like to continue anyway? (you can update dlt after this step)",
default=True,
):
fmt.echo(
"You can update dlt with: pip3 install -U"
@@ -387,9 +399,11 @@ def init_command(
)
# template sources are always in module starting with "pipeline"
# for templates, place config and secrets into top level section
required_secrets, required_config, checked_sources = source_detection.detect_source_configs(
_SOURCES, "pipeline", ()
)
(
required_secrets,
required_config,
checked_sources,
) = source_detection.detect_source_configs(_SOURCES, "pipeline", ())
# template has a strict rules where sources are placed
for source_q_name, source_config in checked_sources.items():
if source_q_name not in visitor.known_sources_resources:
@@ -410,7 +424,11 @@ def init_command(
)
# pipeline sources are in module with name starting from {pipeline_name}
# for verified pipelines place in the specific source section
required_secrets, required_config, checked_sources = source_detection.detect_source_configs(
(
required_secrets,
required_config,
checked_sources,
) = source_detection.detect_source_configs(
_SOURCES, source_name, (known_sections.SOURCES, source_name)
)

@@ -450,6 +468,7 @@ def init_command(
)
if use_generic_template:
fmt.warning("--generic parameter is meaningless if verified source is found")

if not fmt.confirm("Do you want to proceed?", default=True):
raise CliCommandException("init", "Aborted")

@@ -502,3 +521,6 @@ def init_command(
if dependency_system is None:
requirements_txt = "\n".join(source_files.requirements.compiled())
dest_storage.save(utils.REQUIREMENTS_TXT, requirements_txt)

if is_notebook():
create_symlink_to_dlt("_dlt")
Collaborator

Do we have this problem (".dlt" not visible) on Jupyter and Databricks, or just on Colab? I'm aware of it on Colab, so please check that, and if I'm right, create the symlink only for Colab, not for all envs.

Collaborator

Did you take a look at the above? Do you see the .dlt, for example, in a Databricks notebook?
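One way the review request could be addressed is sketched below, under stated assumptions. It does not use the PR's `is_notebook` or `create_symlink_to_dlt`; `is_colab`, `ensure_visible_settings_link`, and `maybe_link_settings` are hypothetical helpers. The Colab check relies on the commonly used heuristic that the `google.colab` module is already loaded in Colab runtimes, which is an assumption rather than something this PR confirms.

```python
import os
import sys
import tempfile


def is_colab() -> bool:
    """Heuristic (assumption): google.colab is preloaded in Colab runtimes."""
    return "google.colab" in sys.modules


def ensure_visible_settings_link(
    project_dir: str, link_name: str = "_dlt", target_name: str = ".dlt"
) -> bool:
    """Create link_name -> target_name in project_dir; True if the link exists afterwards."""
    target = os.path.join(project_dir, target_name)
    link = os.path.join(project_dir, link_name)
    os.makedirs(target, exist_ok=True)
    if os.path.islink(link):
        return True  # already linked, nothing to do
    if os.path.exists(link):
        return False  # a real file or folder is in the way; leave it alone
    # A relative target keeps the link valid if the project folder is moved.
    os.symlink(target_name, link, target_is_directory=True)
    return True


def maybe_link_settings(project_dir: str) -> bool:
    """Create the visible link only on Colab, as the review suggests."""
    if not is_colab():
        return False
    return ensure_visible_settings_link(project_dir)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    created = ensure_visible_settings_link(demo_dir)
    linked_on_this_env = maybe_link_settings(demo_dir)
```

Whether Jupyter or Databricks also hide `.dlt` is exactly the open question in the review, so the environment check is kept in one small function that can be widened later.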