feat: add audio_to_audio_batch streamlit page #75

Open. prescience-data wants to merge 2 commits into main.

@prescience-data prescience-data commented Jan 8, 2023

summary:

  • allows users to generate batches of samples from an original audio segment
  • attempts to follow conventions and flow from audio_to_audio page

features:

  • user uploads a source audio file, which is sliced to the length specified in the first options set
  • user can preview the sliced audio and its spectrogram using a dropdown
  • user specifies the standard Stable Diffusion inputs and an integer batch size
  • on submit, loops for the specified number of batches, using a new random seed on each iteration
  • each batch outputs to a container displaying the seed used, the riffed output, and the differential

misc:

  • adds a .gitignore line for JetBrains IDEs

todo (future):

  • work out how to have audio filenames reflect a hash of the prompt + seed number for easier reference
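
One possible approach to the filename todo, as a rough sketch only. It assumes "hash of the prompt + seed number" means a short hash of the prompt text followed by the plain seed; `sample_filename` and its parameters are hypothetical names and are not part of this PR.

```python
import hashlib


def sample_filename(prompt: str, seed: int, ext: str = "wav") -> str:
    """Build a short, reproducible filename from a prompt and a seed.

    Hypothetical helper: the name, the 10-character digest length, and the
    "<hash>_<seed>.<ext>" layout are all assumptions for illustration.
    """
    # Hash the prompt so arbitrary text becomes a fixed-length,
    # filesystem-safe token.
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:10]
    # Keep the seed readable so a good sample can be regenerated from it.
    return f"{digest}_{seed}.{ext}"
```

Keeping the seed in plain text rather than hashing it together with the prompt means a promising sample can be traced back to its seed directly from the filename.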

@@ -9,6 +9,9 @@ __pycache__/
# VSCode
.vscode

# Jetbrains IDEA

Noticed other IDEs were already ignored, so added this line for JetBrains.

closest_height = int(np.ceil(init_image.height / 32) * 32)
init_image_resized = init_image.resize((closest_width, closest_height), Image.BICUBIC)

st.write("#### Source Clip")

Allows users to preview the slice they have configured in the first options set. However, as the spectrogram is quite tall, this has been placed in an optional expander.

if not submit_button:
    return

for b in range(0, batches):

Primary batch loop. Generates a new random seed each loop.
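
For readers following along, the per-iteration seeding could look roughly like the sketch below. It is illustrative only and not the code in this PR: `batch_seeds`, the optional `rng` argument, and the 32-bit upper bound are assumptions.

```python
import random
from typing import List, Optional


def batch_seeds(batches: int, rng: Optional[random.Random] = None) -> List[int]:
    """Return one freshly drawn seed per batch iteration.

    Hypothetical helper: the PR draws a seed inside the loop; collecting
    them up front here just makes the idea easy to see and to test.
    """
    rng = rng or random.Random()
    # The 32-bit range is an assumed bound, not taken from the PR.
    return [rng.randint(0, 2**32 - 1) for _ in range(batches)]


# Each seed would then drive one generation pass and be shown next to its output.
for seed in batch_seeds(3):
    print(f"seed: {seed}")
```

Displaying the seed with each result is what lets a good sample be reproduced later on the audio_to_audio page.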

)

# Resize back to original size
result_image = result_image.resize(init_image.size, Image.BICUBIC)

Does not attempt stitching, unlike audio2audio. The idea is that batching is more "experimental": generate many short samples to help dial in prompts and find good seed values, then move over to audio2audio for any extended track generation.

riffed_segment.export(audio_bytes, format="wav")
left.audio(audio_bytes)

right.write(f"##### Differential")

Also shows the differential in the second column, to quickly see how much of a shift each seed has generated.

- for Windows hosts, some dependencies may need to be installed manually