Release Info
Did you get here from the release notification and just want to know how to upgrade?
Follow the quickstart instructions to create a new instance using the datalab create command.
Then, in your previous Datalab instance, run the following code cell to copy your data to the new instance (change 'instance_name' to be the name of your new instance):
%%bash
gcloud compute copy-files \
/content/datalab \
datalab@instance_name:/mnt/disks/datalab-pd/datalab
Once you have confirmed that all of your data has been successfully copied to the new instance, you can shut down your old instance.
Note that with the GA release we no longer support running with a kernel-gateway as a backend.
If you created a Datalab instance using the datalab tool, then run the following commands to recreate your VM with the newer image while keeping the same persistent disk (replace ${VM-Name} with the name of your VM):
datalab delete --keep-disk ${VM-Name}
datalab create ${VM-Name}
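The two commands above can also be assembled programmatically, which avoids pasting the `${VM-Name}` placeholder into a shell by accident (in bash, `${VM-Name}` expands to the literal text `Name` when no `VM` variable is set). The following is a minimal sketch only: the helper name `recreate_commands` and the example VM name are hypothetical, and the code only builds and prints the command lines; it does not run them.

```python
import shlex


def recreate_commands(vm_name):
    """Return the delete/create invocations from the text as argv lists.

    Hypothetical helper: it only builds the command lines, it never runs them.
    """
    if not vm_name:
        raise ValueError("vm_name must be a non-empty string")
    return [
        # Delete the VM but keep its persistent disk.
        ["datalab", "delete", "--keep-disk", vm_name],
        # Recreate the VM under the same name so it reuses that disk.
        ["datalab", "create", vm_name],
    ]


if __name__ == "__main__":
    # "my-datalab-vm" is a placeholder name used for illustration.
    for argv in recreate_commands("my-datalab-vm"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Review the printed commands before running them in a terminal, since the delete step is destructive for the VM itself.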
Details about what has changed in each release are tracked on the GitHub releases page.
Releases prior to August 2017 were not tracked as GitHub releases. Instead, their details are below:
Please check the release notes on the releases page
New Features:
- Datalab CLI improvements. Other Pull Requests: https://github.com/googledatalab/datalab/pull/ + #1167,#1171,#1173,#1175,#1178,#1168,#1183,#1187,#1188,#1193,#1194,#1196,#1198,#1202,#1212,#1216
- TensorFlow upgraded to version 1.0
Bug fixes:
- Fix project_id from gcloud config in py3
- Use http Keep-Alive, else BigQuery queries are ~seconds slower than necessary
- Replace 127.0.0.1's with localhost to support IPv6-only machines
This is a new, non-default release with breaking changes. It will evolve into GA in the coming weeks.
How to get it:
- Prerequisite: a client machine with gcloud installed and up to date, and a cloud project with billing and the GCE API enabled. See below for Google Cloud Shell.
- Install the Datalab CLI (https://cloud.google.com/datalab/docs/quickstarts/quickstart-cli):
gcloud components install datalab
- Make sure that you have set the project and zone in your gcloud configuration. If not, you can set the values as follows:
gcloud config set project <project-id>
gcloud config set compute/zone <selected-zone>
How to use it:
Run the following datalab create command to deploy Datalab to a GCE instance in your cloud project.
An explicit image name is used only for the preview. Without that, you will get the mainline beta version.
Charges for GCE VM will apply until you explicitly stop or delete the VM.
The VM instance name is the name you would like to see in Cloud Console. It must start with a lowercase letter. You can create as many instances with distinct names as you like, but each VM is meant for a single user.
- If you are not using Google Cloud Shell, run the following command:
datalab create --image-name gcr.io/cloud-datalab/datalab:v1-pre-ga <vm-instance-name>
Then navigate to 'localhost:8081' to connect to the newly deployed Datalab instance.
- If you are using Google Cloud Shell, run the following command:
datalab create --no-launch-browser --image-name gcr.io/cloud-datalab/datalab:v1-pre-ga <vm-instance-name>
Once the instance is successfully created, use the web-preview feature of Cloud Shell to open port 8081.
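The choice between the two variants above can be sketched as a small helper. This is illustrative only: the function name `build_create_command` and the `in_cloud_shell` parameter are hypothetical. The name check covers only the one rule stated in the text (the name must start with a lowercase letter), so GCE may still reject a name this check accepts.

```python
import re

# Preview image named in the text; without it you get the mainline beta version.
PREVIEW_IMAGE = "gcr.io/cloud-datalab/datalab:v1-pre-ga"


def build_create_command(vm_name, in_cloud_shell=False):
    """Build the `datalab create` argv for the preview release (hypothetical helper)."""
    # Only the rule given in the text is enforced here.
    if not re.match(r"[a-z]", vm_name):
        raise ValueError("VM instance name must start with a lowercase letter")
    argv = ["datalab", "create"]
    if in_cloud_shell:
        # Cloud Shell has no local browser; use its web-preview on port 8081 instead.
        argv.append("--no-launch-browser")
    argv += ["--image-name", PREVIEW_IMAGE, vm_name]
    return argv


if __name__ == "__main__":
    # "my-datalab-vm" is a placeholder name used for illustration.
    print(" ".join(build_create_command("my-datalab-vm")))
    print(" ".join(build_create_command("my-datalab-vm", in_cloud_shell=True)))
```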
Caveats:
- This is a non-mainline, alpha release for those who want to understand the direction of GA. Datalab deployed to a GCE VM via the CLI is going to be the primary and supported option. Those developing Datalab may use the local option, but it will not be officially supported by Google Cloud Support.
- TensorFlow version changed to 0.12, which contains breaking changes. Please excuse the "under construction" Machine Learning tutorials while the update is in progress.
- PyDatalab libraries are not yet at GA version and do not cover planned breaking changes to take them to a stable version. In particular, work for ensuring full support for standard SQL in BigQuery and for an up-level experience with CloudML is in progress.
New Features:
- CLI option to enable automatic backups
- Automatically detect if websockets need to be proxied
- Support configuring user git email address for ungit
- Automatic integration with Google Cloud Source Repositories
- Option to shutdown VM from Datalab UI
- Added test framework using travis
- Use VM zone and name instead of ID for backups
- Restart SSH tunnel if it dies
- Upgrade to ungit 1.1.0
Bug fixes:
- Remove duplicate mount command from startup script
- Pop cell mini toolbar upward if there's not enough space downward
- Dark theme fixes
- Fix a bug where Datalab would not start if the metadata server was down
- Increase the size of the default VM boot disk
- Clean up backup tar ball to reclaim space
- Explicitly configure gcloud to use the test project in release validation tests
This release includes the changes from the previous two releases that were rolled back, along with fixes for the bugs that triggered those rollbacks.
The only visible changes you should see with this release are in the notebook extensions Datalab provides:
- Add the ability to set a default SQL dialect
- Fix for an issue when combining %%chart with UDF queries
Fully rolled back: This release has been rolled back to the previous release (Beta update #12) due to an issue where the Datalab frontend could not communicate with kernel gateways.
That bug was reported and is being tracked here
Fully rolled back: This release has been rolled back to the previous release due to an issue that caused Datalab to redirect to login and give a 404. We will try another release once the root cause has been found and fixed.
The tracking bug for the issue is here
New Features:
- Added ungit integration inside Datalab
- Enabled ipywidgets
- Reduced docker image size
- Added MathJax by default to all notebooks
- Added a dropdown menu for code cells to collapse, hide code, run and clear output
- Support running Datalab from a custom root directory
Bug fixes:
- Fixed a bug where Datalab wouldn't start if GitHub.com was unreachable or there was no Internet connection
- Clear the complete marks from code cells when the kernel is restarted
- Fail the build script when a build step exits with a non-zero status
- Fixed navigation links
Fully rolled back: This release has been rolled back to the previous release due to this issue. We will try another release once the root cause has been found and fixed.
New Features:
- Enabled ipywidgets
- Reduced docker image size
- Added MathJax by default to all notebooks
- Added a dropdown menu for code cells to collapse, hide code, run and clear output
Bug fixes:
- Fixed a bug where Datalab wouldn't start if GitHub.com was unreachable or there was no Internet connection
- Clear the complete marks from code cells when the kernel is restarted
- Fail the build script when a build step exits with a non-zero status
- Fixed navigation links
Bug fixes:
- Removed references to keyboard shortcuts that were no longer supported
- Pagination issue with the output of %%sql cells
- The command palette was too hard to read in the dark theme
- Input text was too hard to read in the dark theme
- Added a link to the API docs
- There was a bug in the passing around of the 'delimiter' parameter for CSV handling
- Performance fix for loading data into a paged table
New features:
- Datalab will remember the last directory the user opened and use that as the default the next time it is opened.
Bug fixes:
- The file name was not being displayed in the header bar
- The link to the new version is hard to read in the dark theme
- Datalab commands are now registered as both line and cell magic
- The new BigQuery types are now supported
New features:
- Updated dataflow to version 0.4.2
New features:
- ML Alpha Functionality
New features:
- Cleaned up main tool bar with:
  - An easier to find "Sign In" button.
  - An indication of the current signed-in account.
  - The ability to select different themes.
- Support a web-based OAuth flow for running with a backend in Google Compute Engine.
- Turned on Jupyter's notebook notary feature.
- Reduced the amount of command line output for the docker run command.
- Assorted CSS fixes to make the UI more consistent.
- Removed an unnecessary level of nesting from the docs directory.
New features:
- Greater flexibility and control over how Datalab runs. You can now run it on your own machine, and can (optionally) connect to a backend running on the Google Cloud Platform. You also now have full control over how your notebooks are managed. This version is a single-user version; i.e. each team member can run on their own machine or use their own VM. For more details, see here
Please note that the older, AppEngine Flex (formerly Managed VM) based versions of Datalab will soon be deprecated. So we urge you to upgrade to this version as soon as possible.
New features:
- Fixed an issue where the system could become stuck for a specific user if the file system operations to create the local copy of their workspace failed (#904).
New features:
- TensorFlow updated to version 0.7.1
- gcloud updated to version 106.0.0
- Fix for issue #884, where requests might be routed to the wrong Jupyter server.
- Fixed an issue where deployments would fail if the project name contained the word "project" (#858)
- Fixed another deployment issue when the 'datalab' branch has been deleted (#788)
New features:
- TensorFlow included in Datalab container
- New sample: Machine Learning with TensorFlow and financial data. Existing deployments are currently not automatically updated but you can copy the sample from https://github.com/GoogleCloudPlatform/datalab/tree/master/content/datalab/samples
- Simple controls for charts
- Schema inference for nested tables
New features:
- BigQuery UDF support - Use JavaScript user-defined functions within SQL queries
- BigQuery federated data sources support - Use SQL to directly query structured data stored in Google Cloud Storage
Updated Sample Notebooks:
- New sample notebooks have been added and existing ones have been updated.
- Instructions to update your copy of samples are outlined in datalab/readme.ipynb, but in short: Please open the readme notebook, run it, and then commit the resulting changes to your repository.
Bug fixes:
- Security fix to prevent users with “viewer” permission from accessing Datalab. Cloud project members within the 'Viewer' group are no longer authorized. Only 'Editors' and 'Owners' are. All users will need to launch Datalab via https://datalab.cloud.google.com to access their Datalab instance at least once after updating, so their role can be validated.
- Lost data caused by notebook sync issues has been addressed.
- Commits that sometimes were attributed to an unknown user are now properly attributed to the signed-in user.
- Notebooks saved as html will now show charts (charts may not be visible on github due to an underlying Google Charts issue)
Breaking change:
In order to accommodate more options for BigQuery federated data sources, the following API was changed. See datalab/tutorials/BigQuery/Importing and Exporting Data.ipynb for an example.
- Old:
Table.load(source, mode='create', source_format='csv',
csv_delimiter=',', csv_skip_header_rows=0, encoding='utf-8',
quote='"', allow_quoted_newlines=False, allow_jagged_rows=False,
ignore_unknown_values=False, max_bad_records=0)
- New:
options = CSVOptions(delimiter=',', skip_leading_rows=0, encoding='utf-8',
quote='"', allow_quoted_newlines=False, allow_jagged_rows=False)
Table.load(source, mode='create', source_format='csv', csv_options=options,
ignore_unknown_values=False, max_bad_records=0)
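When migrating existing calls, the keyword renames implied by the two signatures above can be written down as a mapping. The sketch below is illustrative only: the helper `split_legacy_load_kwargs` is hypothetical and not part of the Datalab library. It is plain Python that separates old-style keyword arguments into the ones that now belong to `CSVOptions` and the ones that stay on `Table.load`.

```python
# Old Table.load keyword -> new CSVOptions keyword, read off the signatures above.
CSV_OPTION_RENAMES = {
    "csv_delimiter": "delimiter",
    "csv_skip_header_rows": "skip_leading_rows",
    "encoding": "encoding",
    "quote": "quote",
    "allow_quoted_newlines": "allow_quoted_newlines",
    "allow_jagged_rows": "allow_jagged_rows",
}


def split_legacy_load_kwargs(legacy_kwargs):
    """Split old-style Table.load keywords into (csv_kwargs, load_kwargs).

    Hypothetical helper for illustration; it does not call the Datalab API.
    """
    csv_kwargs = {}
    load_kwargs = {}
    for name, value in legacy_kwargs.items():
        if name in CSV_OPTION_RENAMES:
            # Moved into CSVOptions, possibly under a new name.
            csv_kwargs[CSV_OPTION_RENAMES[name]] = value
        else:
            # mode, source_format, ignore_unknown_values, max_bad_records stay on load().
            load_kwargs[name] = value
    return csv_kwargs, load_kwargs


if __name__ == "__main__":
    old = {"csv_delimiter": "\t", "csv_skip_header_rows": 1, "max_bad_records": 5}
    csv_kwargs, load_kwargs = split_legacy_load_kwargs(old)
    print(csv_kwargs)
    print(load_kwargs)
```

Assuming `CSVOptions` and `Table` are available as in the tutorial notebook, the results would then be used as `Table.load(source, csv_options=CSVOptions(**csv_kwargs), **load_kwargs)`.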
- First public release of Cloud Datalab (beta)
- Main features: Support for using SQL with BigQuery and Python for interactive data exploration, visualization and analysis.