New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data Infra for Training #124
base: main
Are you sure you want to change the base?
Conversation
…ng-fork. implemented and tested a dataset only arg version of download.py. Need to reconsider the other arguments.
…to test the ecosystem. Tests are written but tox is currently not passing. Many files reformatted by black.
…project.toml and tox.yml, which needs review. Added .jams files to MANIFEST.in to accomodate test file for guitarset. Added check for pytest run to guitarset download to avoid unnecessary full download during testing, need review. tox passing.
…roject.toml. Moved all tests relating to data into their own test directory. Updated path to resources accordingly.
… test for tf_example_serialization, added / corrected tests for other data / train files.
…sted, add .zip files to manifest.in
…tall sox.portable
CONTRIBUTING.md
Outdated
@@ -13,6 +13,7 @@ We recommend first installing the following non-python dependencies: | |||
- To install on Windows, run `choco install libsndfile` using [Chocolatey](https://chocolatey.org/) | |||
- To install on Ubuntu, run `sudo apt-get update && sudo apt-get install --no-install-recommends -y --fix-missing pkg-config libsndfile1` | |||
- [ffmpeg](https://ffmpeg.org/) is a complete, cross-platform solution to record, convert and stream audio in all `basic-pitch` supported formats | |||
- sox |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a description and mention that it's required for data creation?
"disk_size_gb": 128, | ||
"experiments": ["use_runner_v2"], | ||
"save_main_session": True, | ||
"worker_harness_container_image": known_args.worker_harness_container_image, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to run a DirectRunner test with a harness container image? We may need to provide a Dockerfile or instructions on how to create one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's discuss in meeting
@@ -0,0 +1,41 @@ | |||
import argparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add copyright notice
#!/usr/bin/env python | ||
# encoding: utf-8 | ||
# | ||
# Copyright 2022 Spotify AB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking of copy right notices, you probably should update the year.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in all the files?
basic_pitch/note_creation.py
Outdated
@@ -110,7 +110,11 @@ def model_output_to_notes( | |||
) | |||
|
|||
|
|||
def sonify_midi(midi: pretty_midi.PrettyMIDI, save_path: Union[pathlib.Path, str], sr: Optional[int] = 44100) -> None: | |||
def sonify_midi( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh why is there a ton of formatting changes? Do you use a different max line length than the one specified in pyproject.toml?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, but I did run the formatter via tox. I can switch it all back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the new ones at least
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched it all back, but after running tox it does seem that black wants to reformat two long lines into multi line blocks even though they don't exceed the line limit. Wonder if it's some other rule
pyproject.toml
Outdated
@@ -19,6 +19,7 @@ classifiers = [ | |||
] | |||
dependencies = [ | |||
"coremltools; platform_system == 'Darwin'", | |||
"apache_beam", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be in the general project deps. This should be in extras only.
… sox in CONTRIBUTING.md, add copyright notice to download.py, remove apache_beam from general dependencies.
started a new branch because there was some mismatch with the previous branch and remote, so this seemed easier.
Not infra as in backend, obv. but code structure and tests for datasets, with guitarset implemented as a first inclusion / example.