Fix FordA download url in classification notebook (#309)
* fix ford_a download url

* fix url

* update changelog

---------

Co-authored-by: Egor Baturin <egoriyaa@github.com>
egoriyaa and Egor Baturin committed May 13, 2024
1 parent 97515b9 commit 010fa43
Showing 2 changed files with 19 additions and 13 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -42,7 +42,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
-

### Fixed
-
- Fix FordA download url in classification notebook ([#309](https://github.com/etna-team/etna/pull/309))
-
-
-
30 changes: 18 additions & 12 deletions examples/305-classification.ipynb
@@ -56,7 +56,15 @@
"execution_count": 3,
"id": "c085ebe2",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[34m\u001b[1mwandb\u001b[0m: \u001b[33mWARNING\u001b[0m Disabling SSL verification. Connections to this server are not verified and may be insecure!\n"
]
}
],
"source": [
"import pathlib\n",
"\n",
@@ -90,9 +98,7 @@
"source": [
"### 1.1 Loading dataset <a class=\"anchor\" id=\"section_1_1\"></a>\n",
"\n",
"Consider the example `FordA` dataset from [UCR archive](https://www.cs.ucr.edu/~eamonn/time_series_data/). Dataset consists of engine noise measurements and the problem is to diagnose whether a certain symptom exists in the engine. The comprehensive description of `FordA` dataset can be found [here](http://www.timeseriesclassification.com/description.php?Dataset=FordA). \n",
"\n",
"It is possible to load the dataset using `fetch_ucr_dataset` function from [`pyts` library](https://pyts.readthedocs.io/en/stable/index.html), but let's do it manually."
"Consider the example `FordA` dataset from [UCR archive](https://www.cs.ucr.edu/~eamonn/time_series_data/). Dataset consists of engine noise measurements and the problem is to diagnose whether a certain symptom exists in the engine."
]
},
{
@@ -107,13 +113,13 @@
"text": [
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
" Dload Upload Total Spent Left Speed\n",
"100 34.6M 100 34.6M 0 0 2585k 0 0:00:13 0:00:13 --:--:-- 2826k\n"
"100 301M 100 301M 0 0 4640k 0 0:01:06 0:01:06 --:--:-- 4085k33k 0 0:01:41 0:00:07 0:01:34 4195k 0 0:01:25 0:00:14 0:01:11 4251k 0 0:01:07 0:00:47 0:00:20 5043k\n"
]
}
],
"source": [
"!curl \"https://timeseriesclassification.com/aeon-toolkit/FordA.zip\" -o data/ford_a.zip\n",
"!unzip -q data/ford_a.zip -d data/ford_a"
"!curl https://www.cs.ucr.edu/~eamonn/time_series_data_2018/UCRArchive_2018.zip -o data/ucr_datasets.zip\n",
"!unzip -q -P someone -j data/ucr_datasets.zip 'UCRArchive_2018/FordA/*.tsv' -d data/"
]
},
{
@@ -123,9 +129,9 @@
"metadata": {},
"outputs": [],
"source": [
"def load_ford_a(path: pathlib.Path, dataset_name: str):\n",
" train_path = path / (dataset_name + \"_TRAIN.txt\")\n",
" test_path = path / (dataset_name + \"_TEST.txt\")\n",
"def load_ford_a(path: str):\n",
" train_path = path + \"_TRAIN.tsv\"\n",
" test_path = path + \"_TEST.tsv\"\n",
" data_train = np.genfromtxt(train_path)\n",
" data_test = np.genfromtxt(test_path)\n",
"\n",
@@ -145,14 +151,14 @@
"metadata": {},
"outputs": [],
"source": [
"X_train, X_test, y_train, y_test = load_ford_a(pathlib.Path(\"data\") / \"ford_a\", \"FordA\")\n",
"X_train, X_test, y_train, y_test = load_ford_a(\"data/FordA\")\n",
"y_train[y_train == -1], y_test[y_test == -1] = 0, 0 # transform labels to 0,1"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "c6f62d48",
"id": "fa1581fb",
"metadata": {},
"outputs": [
{
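Note on the extraction step: the `unzip -q -P someone -j ... 'UCRArchive_2018/FordA/*.tsv' -d data/` call in the notebook can be approximated in plain Python where a shell `unzip` is not available. The following is a minimal sketch, not code from this commit: the helper name `extract_flat` is hypothetical, and it assumes the archive uses the traditional ZIP encryption that Python's `zipfile` module can decrypt.

```python
import fnmatch
import pathlib
import zipfile


def extract_flat(zip_path, pattern, dest, password=None):
    """Extract members matching `pattern` into `dest`, dropping directories.

    Hypothetical helper that roughly mirrors
    `unzip -j -P <password> <zip_path> '<pattern>' -d <dest>`.
    """
    dest = pathlib.Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # zipfile expects the password as bytes; it is ignored for unencrypted members.
    pwd = password.encode() if password is not None else None

    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directory entries and anything outside the requested pattern.
            if info.is_dir() or not fnmatch.fnmatchcase(info.filename, pattern):
                continue
            # Keep only the file name, like the `-j` (junk paths) flag.
            target = dest / pathlib.PurePosixPath(info.filename).name
            with archive.open(info, pwd=pwd) as src:
                target.write_bytes(src.read())
            written.append(target)
    return written
```

With the paths from the diff, a call would look like `extract_flat("data/ucr_datasets.zip", "UCRArchive_2018/FordA/*.tsv", "data", password="someone")`, after which `load_ford_a("data/FordA")` finds `data/FordA_TRAIN.tsv` and `data/FordA_TEST.tsv`. The sketch reads each member fully into memory, which is acceptable for files of this size but not for very large members.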