Filters zhaw #327

Merged
merged 20 commits into from
Jul 27, 2023

krumjan
Contributor

@krumjan krumjan commented Feb 28, 2023

Added four functions for trajectory filtering/preprocessing (median_filter, deriv_filter, cluster_filter, smoothing), as well as a fifth function (filter_zhaw) which applies the other four with parameters adjusted to the different flight parameters.

For trajectories of good quality, the current traffic filter() function and filter_zhaw() appear to produce similar results. For trajectories of poor quality, however, the new approach often seems to perform better. Below are two examples of trajectories of rather poor quality (blue = unfiltered, green = traffic filter, orange = filter_zhaw).

[Images: filtered, filtered_4]

@xoolive
Owner

xoolive commented Feb 28, 2023

Thank you Jan,

Could you try to rebase your code onto master in order to fix the conflicts? (It's basically because the traffic folder now sits in src/...)

A few preliminary comments:

  • it would be nice to have some of the samples you use to make your plots (good and bad ones). I suspect you don't interpolate after filtering on your green plot... These would be helpful for documentation and testing.

  • could we rename the functions as filter_* rather than *_filter? Could they also take a list of features rather than individual names? (This will simplify filter_zhaw in the end; note we will probably change the name, or merge them into a generic filter function later.)

  • is filter_median the same as the current filter() function?

@krumjan
Contributor Author

krumjan commented Mar 3, 2023

Hi Xavier,

I hope the rebasing I attempted fixed the conflicts. In commit f9e9fee, I renamed all functions to filter_*. Additionally, after the changes in commit bd08bcc, all functions now accept lists as inputs instead of individual column names and parameter values.

As far as I understand, there are some differences between the current filter() function and the new filter_median(). filter_median() replaces all the values with the median of the sliding window, while filter() replaces unacceptable values (based on a computed threshold) with NaN before filling them with a strategy such as forward/backward fill.
In the specific case of applying it in combination with the other newly introduced filter steps, such as in filter_zhaw(), I achieved better results with filter_median() than with filter().
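To make the difference described above concrete, here is a minimal pandas sketch. It is not the code from this PR or from traffic: the function names, the fixed `threshold`, and the sample altitudes are assumptions for illustration (the existing filter() computes its threshold rather than taking a fixed one).

```python
import pandas as pd


def rolling_median_replace(series: pd.Series, kernel_size: int = 3) -> pd.Series:
    """Replace every sample with the median of a centered sliding window."""
    return series.rolling(kernel_size, center=True, min_periods=1).median()


def threshold_then_fill(
    series: pd.Series, kernel_size: int = 3, threshold: float = 100.0
) -> pd.Series:
    """Mask samples far from the rolling median with NaN, then fill the gaps.

    The fixed threshold is an assumption made for this sketch.
    """
    median = series.rolling(kernel_size, center=True, min_periods=1).median()
    masked = series.where((series - median).abs() <= threshold)
    return masked.ffill().bfill()


# Hypothetical altitude samples with a single spike at position 3.
altitude = pd.Series([1000.0, 1010.0, 1020.0, 5000.0, 1040.0, 1050.0, 1060.0])

by_median = rolling_median_replace(altitude)
by_threshold = threshold_then_fill(altitude)
```

In this sketch the median variant alters samples that were never outliers (the first value, for instance, becomes the median of its window), while the threshold variant leaves every accepted sample untouched and only fills the masked spike.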

Where do you suggest putting the sample trajectories? Maybe "/src/traffic/data/samples"?

@xoolive
Owner

xoolive commented Mar 3, 2023

Hi @krumjan
Maybe just drop the parquet files here and I'll handle it from there (as soon as I can 😓)

@krumjan
Contributor Author

krumjan commented Mar 3, 2023

The following zip contains two parquet files: "trajs_good.parquet", which contains some trajectories of rather good quality (if anything, they only contain a few spikes), and "trajs_bad.parquet", which contains rather noisy trajectories. ;)

trajs.zip

        :param kernel_size: The size of the kernel to use for the rolling mean.
        """
        for paracol, kernel_size in zip(paracols, kernel_sizes):
            data = self.data.reset_index(drop=True)
Owner

Shouldn't this line be outside the loop?
Also, why not name this one filter_mean? (the only difference with filter_median is the mean method, right?)
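One way the suggestion could look, as a hedged sketch rather than the code that was eventually merged: the index is reset once before the loop, and each column gets its own rolling mean. The function name `filter_mean_sketch`, the centered window, and the sample frame are assumptions for illustration.

```python
import pandas as pd


def filter_mean_sketch(
    df: pd.DataFrame, paracols: list[str], kernel_sizes: list[int]
) -> pd.DataFrame:
    """Apply a centered rolling mean to each listed column."""
    # Reset the index once: this step does not depend on the column,
    # so it does not need to be repeated inside the loop.
    data = df.reset_index(drop=True)
    for paracol, kernel_size in zip(paracols, kernel_sizes):
        data[paracol] = (
            data[paracol].rolling(kernel_size, center=True, min_periods=1).mean()
        )
    return data


# Hypothetical sample data with a non-default index.
df = pd.DataFrame(
    {
        "altitude": [1000.0, 1030.0, 1020.0, 1050.0],
        "groundspeed": [200.0, 210.0, 220.0, 230.0],
    },
    index=[10, 11, 12, 13],
)
smoothed = filter_mean_sketch(df, ["altitude", "groundspeed"], [3, 3])
```

Resetting inside the loop, as in the quoted diff, would start again from the unfiltered data on every iteration, so only the last column's result would survive.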

        for paracol, th1, th2, window in zip(paracols, th1s, th2s, windows):
            diff1 = abs(data[paracol].diff())
            if paracol == "track":
                diff1.loc[(diff1 < 370) & (diff1 > 350)] = 0
Owner

Would it be better to work on track_unwrapped?
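A small NumPy sketch of why an unwrapped track can make the special case unnecessary. The sample headings are made up, and how traffic itself derives track_unwrapped is not shown in this thread; the unwrapping here uses `np.unwrap` on radians as one possible approach.

```python
import numpy as np

# Hypothetical track angles in degrees, crossing north (359 -> 1).
track = np.array([350.0, 355.0, 359.0, 1.0, 5.0, 10.0])

# Differences on the raw angles show a spurious jump of 358 degrees at the wrap.
naive_diff = np.abs(np.diff(track))

# np.unwrap works in radians by default, so convert, unwrap, convert back.
track_unwrapped = np.degrees(np.unwrap(np.radians(track)))

# Differences on the unwrapped signal only reflect the actual heading change.
unwrapped_diff = np.abs(np.diff(track_unwrapped))
```

With the unwrapped signal, a derivative-based filter can use the same thresholds as for any other feature, without zeroing differences between 350 and 370.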

@xoolive
Owner

xoolive commented Jul 20, 2023

Thank you Jan for the pull request

I made some major changes to the way we do filtering, and would love to have you check them. Basically:

  • the Flight.filter() function takes a Filter class as a parameter, and most filters are defined in algorithms/filters.py. The plan is to add more advanced filters in the future, such as Kalman Filter, EKF, etc.
  • Filters can be piped either by applying filter() many times, or by creating a filter object with the | (pipe) operator.
  • For the moment, your ZHAW filter implementation is commented out in filters.py. We can provide it as it is here (with a better name) or have users chain their own filters if they prefer.
  • For compatibility purposes, the old syntax of filter() still works: a FilterAboveSigmaMedian class is created from the kwargs arguments (the new way of doing things).
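The piping idea in the list above can be sketched in plain Python. This is not the implementation in algorithms/filters.py: the classes `SketchFilter`, `SketchChain`, and `SketchWindow` are hypothetical names that operate on lists of floats, and they only illustrate how a `|` operator can compose filters.

```python
from __future__ import annotations

from statistics import mean, median
from typing import Callable, Sequence


class SketchFilter:
    """Hypothetical base class: subclasses implement apply()."""

    def apply(self, values: list[float]) -> list[float]:
        raise NotImplementedError

    def __or__(self, other: SketchFilter) -> SketchFilter:
        # `a | b` builds a chain that runs a first, then b.
        return SketchChain([self, other])


class SketchChain(SketchFilter):
    """Runs several filters one after the other."""

    def __init__(self, filters: Sequence[SketchFilter]) -> None:
        self.filters = list(filters)

    def apply(self, values: list[float]) -> list[float]:
        for f in self.filters:
            values = f.apply(values)
        return values

    def __or__(self, other: SketchFilter) -> SketchFilter:
        # Keep the chain flat when more filters are piped in.
        return SketchChain([*self.filters, other])


class SketchWindow(SketchFilter):
    """Replaces each value with a statistic of its centered window."""

    def __init__(
        self, kernel_size: int, reducer: Callable[[Sequence[float]], float]
    ) -> None:
        self.kernel_size = kernel_size
        self.reducer = reducer

    def apply(self, values: list[float]) -> list[float]:
        half = self.kernel_size // 2
        return [
            self.reducer(values[max(0, i - half) : i + half + 1])
            for i in range(len(values))
        ]


median_filter = SketchWindow(3, median)
mean_filter = SketchWindow(3, mean)
values = [1.0, 2.0, 100.0, 4.0, 5.0]

# Piping with | gives the same result as applying the filters one by one.
piped = (median_filter | mean_filter).apply(values)
sequential = mean_filter.apply(median_filter.apply(values))
```

Both routes mentioned in the list (calling filter() repeatedly, or building one piped object) would then produce the same output, which is what the last two lines check in this sketch.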

Once you have a look, I will start working on tests.

It would be nice if you could also check the documentation of your classes and write what you like there. I copied and pasted things, but you may prefer to put things differently. (It appears in api_reference/traffic.algorithms.filters.html)

I may keep illustrated examples from this PR for a specific user guide on filtering (https://traffic-viz.github.io/user_guide/processing.html)

@xoolive xoolive merged commit 5f37b41 into xoolive:master Jul 27, 2023
4 checks passed