@shmruin (Contributor) commented Jul 13, 2025

What is this PR for?

Currently, the download link for the bank.zip file on the tutorial page (https://zeppelin.apache.org/docs/0.12.0/quickstart/tutorial.html) is broken.

The root cause is a change in the URL for the UCI Machine Learning dataset. The previous link, http://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip, is no longer valid.

The dataset with the same ID (222) is now located at https://archive.ics.uci.edu/dataset/222/bank+marketing.

Additionally, bank.zip is no longer offered as a standalone file. It is now nested inside a bank+marketing.zip archive.

This PR updates the tutorial to:

  • Replace the broken link with the new, correct URL for the bank+marketing.zip file.
  • Add a clear instruction for users to first unzip the main bank+marketing.zip archive to find and use the required bank.zip file within it.
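
The two-step unzip described above can also be scripted. The following is a minimal sketch, not part of this PR: it assumes bank+marketing.zip has already been downloaded from the dataset page, and the helper name `extract_nested_zip` is hypothetical.

```python
import io
import zipfile
from pathlib import Path


def extract_nested_zip(outer_path, inner_name, dest_dir):
    """Extract the contents of a zip file that is stored inside another zip.

    outer_path: path to the outer archive (assumed here: bank+marketing.zip)
    inner_name: name of the nested archive inside it (assumed here: bank.zip)
    dest_dir:   directory that receives the inner archive's files
    Returns the list of member names found in the inner archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(outer_path) as outer:
        # Read the nested archive into memory so it can be opened as a seekable file.
        inner_bytes = outer.read(inner_name)

    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        inner.extractall(dest)
        return inner.namelist()
```

For example, `extract_nested_zip("bank+marketing.zip", "bank.zip", "data")` would place the files from the nested archive under `data/`, assuming the archive sits in the current directory. Reading the inner archive into memory keeps the sketch short and avoids writing an intermediate bank.zip to disk.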

What type of PR is it?

Bug Fix

Todos

What is the Jira issue?

[ZEPPELIN-6199]

How should this be tested?

  • Run the updated documentation locally with Docker.
  • Download the file from the updated page, and run the related tutorials with it in a Zeppelin notebook.

Screenshots (if appropriate)

  • Run tutorial with new bank.zip file. (Screenshot: Tutorial Test Result)
  • Add a new instruction to use the data file. (Screenshot: Fixed Tutorial Page)

Questions:

  • Do the license files need to be updated? No
  • Are there breaking changes for older versions? No
  • Does this need documentation? No

@tbonelee (Contributor) left a comment

LGTM. The download link on the tutorial page works, and the zip file contains the necessary files.

@pan3793 merged commit d8b2f0c into apache:master on Jul 13, 2025
17 of 18 checks passed
pan3793 pushed a commit that referenced this pull request Jul 13, 2025

Closes #4966 from shmruin/ZEPPELIN-6199.

Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit d8b2f0c)
Signed-off-by: Cheng Pan <chengpan@apache.org>
@pan3793 (Member) commented Jul 13, 2025

Thanks, merged to master/0.12
