Import fixes #161
@@ -990,7 +990,7 @@ def generate_automated_labels_microfaune(
 if annotations.empty:
     annotations = new_entry
 else:
-    annotations = annotations.append(new_entry)
+    annotations = pd.concat([annotations, new_entry])
So for concat, it's a bit slower if we append each row to a DataFrame one at a time. According to the docs, it's better to instead save each new_entry to a list and then concat the list once.
So basically:
rows = []
for i in etc:
    rows.append(some new row for a future df)
df = pd.concat(rows)
Got it. TBH I was just doing a find and replace for this PR and didn't fully look at all the context. I'll go back and make the necessary changes.
See comments. Make sure to build a list of rows and then concat once, so we don't call concat over and over again inside each for loop. According to https://pandas.pydata.org/docs/reference/api/pandas.concat.html, that's the better practice.
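For reference, a minimal sketch of the list-then-concat pattern requested above. The helper `make_entry`, the function `build_annotations`, and the column names `clip` and `offset` are hypothetical stand-ins, not names from this PR, and `ignore_index=True` is an assumption about the desired index rather than something the diff uses.

```python
import pandas as pd


def make_entry(clip_name, offset):
    # Hypothetical stand-in for whatever produces `new_entry` in the real loop.
    return pd.DataFrame({"clip": [clip_name], "offset": [offset]})


def build_annotations(clips):
    # Collect the per-iteration DataFrames in a plain Python list...
    entries = []
    for clip_name, offset in clips:
        entries.append(make_entry(clip_name, offset))

    # ...and concatenate once at the end. pd.concat raises ValueError on an
    # empty list, so handle the "nothing collected" case explicitly.
    if not entries:
        return pd.DataFrame()
    return pd.concat(entries, ignore_index=True)


annotations = build_annotations([("a.wav", 0.0), ("a.wav", 1.5), ("b.wav", 0.0)])
print(annotations)
```

This also removes the need for the `if annotations.empty` branch, since the list handles the first iteration the same way as every other one.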
# Open file with librosa (uses ffmpeg or libav)
print("Path: ", path)
# Open file with librosa (uses ffmanaeg or libav)
Change ffmanaeg back to ffmpeg
@@ -263,25 +263,25 @@ def automated_labeling_statistics(
 if statistics_df.empty:
     statistics_df = clip_stats_df
 else:
-    statistics_df = statistics_df.append(clip_stats_df)
+    statistics_df = pd.concat([statistics_df,clip_stats_df])
Same note about pandas concatenation here.
 continue
 if num_processed % 50 == 0:
     print("Processed", num_processed, "clips in", int((time.time() - start_time) * 10) / 10.0, 'seconds')
     start_time = time.time()
 if num_errors > 0:
-    checkVerbose("Something went wrong with" + num_errors + "clips out of" + str(len(clips)) + "clips", verbose)
+    checkVerbose(f"Something went wrong with {num_errors} clips out of {len(clips)} clips", verbose)
good change!
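A short sketch of why this matters, assuming `num_errors` is an int counter (the values below are made up for illustration): concatenating a `str` with an `int` raises `TypeError`, and the old string also had no spaces around the numbers, whereas the f-string formats both values directly.

```python
num_errors = 3                                # illustrative value
clips = ["a.wav", "b.wav", "c.wav", "d.wav"]  # illustrative clip list

# Old form: str + int fails at runtime instead of building the message.
old_error = None
try:
    message = "Something went wrong with" + num_errors + "clips"
except TypeError as err:
    old_error = err

# New form: the f-string converts the values and keeps the spacing readable.
message = f"Something went wrong with {num_errors} clips out of {len(clips)} clips"
print(message)
```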
Make sure to remove the pyha tutorial from the PR before submitting.
Added resampy to toml and lock files
Replaced DataFrame.append with pd.concat
Other minor fixes