More activity types, and open sourcing the ML classifiers #34

sobri909 · 2018-07-29T05:39:14Z

It's too early to start on this one yet, but implementation ideas have been popping up in my head all week, so I might as well get them written down.

Misc random thoughts

1. Ditch the coords pseudo counts for transport types that are coordinate bound (eg car, bus, train, and any future road/track/etc bound types).
2. Only include types in the transport classifier that have non-zero sample/coord counts for the D2 region, to avoid overstuffing the classifiers with possible types.
3. Look at ditching some model features that are possibly a negative influence. coreMotionActivityType is the most likely candidate for ditching, due to being wrong more often than right. Next would be courseVariance, which has been essentially meaningless for all types except stationary since the introduction of the Kalmans.
4. Try letting zero step Hz back in for walking. There's certainly enough data for the models now, and phones are newer and smarter, so there's less risk of false zeroes, and more data to cope gracefully with those cases now too.
5. Look at deepening / broadening the composite classifiers tree. Could consider clustering all "steps" based types in a subtree (walking, running, cycling, horse riding, maybe skateboarding, others). Consider clustering the road bound types (car, bus, tram?). And there's the open question of whether a third depth would make sense in some cases, for when there are too many types to be sensibly clustered in a single classifier.
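
To make the first two ideas concrete, here's a rough sketch. It assumes the "coords pseudo counts" are a smoothing count added to every bin of a per-type coordinate histogram, so that bins with no samples still get a small non-zero probability. All the names here (`TypeModel`, `CoordinateHistogram`, `isCoordinateBound`, `candidateModels`) are made up for illustration and aren't LocoKit's actual API.

```swift
// Hypothetical types for illustration only; not LocoKit's real API.
enum ActivityType: String, CaseIterable {
    case stationary, walking, running, cycling, car, bus, train

    /// Types assumed to be confined to roads or tracks.
    var isCoordinateBound: Bool {
        switch self {
        case .car, .bus, .train: return true
        default: return false
        }
    }
}

struct CoordinateHistogram {
    /// Number of samples seen in each lat/long bin of the region.
    var binCounts: [Int]

    /// Smoothed probability of a sample landing in the given bin.
    /// A pseudo count of zero means unseen bins get a probability of zero.
    func probability(ofBin index: Int, pseudoCount: Double) -> Double {
        let total = Double(binCounts.reduce(0, +)) + pseudoCount * Double(binCounts.count)
        guard total > 0 else { return 0 }
        return (Double(binCounts[index]) + pseudoCount) / total
    }
}

struct TypeModel {
    let type: ActivityType
    let histogram: CoordinateHistogram

    var sampleCount: Int { histogram.binCounts.reduce(0, +) }

    /// Idea 1: no pseudo counts for coordinate bound types.
    func coordinateScore(forBin index: Int) -> Double {
        let pseudoCount: Double = type.isCoordinateBound ? 0 : 1
        return histogram.probability(ofBin: index, pseudoCount: pseudoCount)
    }
}

/// Idea 2: only types with at least one sample in the region are candidates.
func candidateModels(from models: [TypeModel]) -> [TypeModel] {
    models.filter { $0.sampleCount > 0 }
}

let walking = TypeModel(type: .walking, histogram: CoordinateHistogram(binCounts: [3, 0, 1, 0]))
let car = TypeModel(type: .car, histogram: CoordinateHistogram(binCounts: [0, 6, 0, 0]))
let train = TypeModel(type: .train, histogram: CoordinateHistogram(binCounts: [0, 0, 0, 0]))

let candidates = candidateModels(from: [walking, car, train])
print(candidates.map { $0.type.rawValue })
print(walking.coordinateScore(forBin: 1))
print(car.coordinateScore(forBin: 0))
```

With this toy data, train drops out of the candidate list because it has no samples in the region. Walking keeps a small non-zero score (0.125) in a bin where it has never been seen, while car scores zero anywhere off its known bins. That's the intended effect: coordinate bound types can't turn up where they've never been recorded.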

Need to flesh out the storage model for user custom types. And for custom types that have the potential to be upgraded to shared types. Things like naming specific train lines, naming specific roads, etc. These might initially come in as custom user types, but there needs to be a privacy conscious, opt-in path to upgrading these to shared types, if it makes sense to do so.

I'll come back to this and add more thoughts over the next week or two. There's a bunch more details that have already crossed my mind, but I'm not remembering them right now...


sobri909 · 2018-08-11T10:45:53Z

Okay, I want to start on this now.

I'll initially be doing it in several smaller stages, so that the small classifier changes can be tested individually in the wild, for improvements / regressions in accuracy results.

Stage 1 will be items 1 and 2 from the notes list. So the stage 1 goal will be classifier feature / weighting tweaks, to hopefully eliminate the bogus transport type results that've been turning up in locations where they obviously shouldn't.

I'll also probably throw in items 3 and 4, because life is short and people want things now, not later. So it's more important to get results quickly rather than achieving an exact mathematical science.

I'll also probably throw in "tram" as a new transport type. Because literally hundreds of people have asked for it, and adding in one more base type will be an interesting first test for whether the other changes have been beneficial or not.

Oh, I guess I'm supposed to be open sourcing the classifier code too. Hmm. I don't have a solid plan for that yet. So I'll eye the code up while I'm going along, and if there's anything base level that I can cleanly open source along the way, I'll do it then. Basically mix in the open sourcing process along with the classifier improvements.

Alright ... time to get this happening.

sobri909 · 2018-08-11T10:59:27Z


sobri909 · 2018-08-22T11:10:37Z

Remove the special case rejection of zero stepHz values for walking (Maybe? As above, number hunt first)

I gave this one a couple of days on my test devices, and it was a disaster.

The pedometer's data is far too gappy to make it feasible. A large portion of walking samples are recorded with zero cadence and zero step counts. Which results in the walking model getting bloated with zero stepHz values, and the classifier treating a large percentage of vaguely moving indoor samples as walking (eg playing with your phone while lying on the couch). If the building produces drifting location data, then the percentage of false positives goes up significantly. It's a mess.

So that's a failed experiment, and definitely not going to ship!
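
For anyone wondering what the special case looks like, here's a minimal sketch of the kind of filter being discussed. It assumes the rejection happens when collecting stepHz values for a type's model, and that zero cadence is still a valid value for other types. `MotionSample` and `stepHzTrainingValues` are hypothetical names, not the real LocoKit code.

```swift
// Hypothetical sample type; the field names are assumptions.
struct MotionSample {
    let confirmedType: String   // user confirmed activity type name
    let stepHz: Double?         // pedometer cadence in steps per second, nil if unavailable
}

/// Collects the stepHz values that feed the model for one activity type.
/// Zero values are discarded for walking only, because a gappy pedometer
/// reports zero cadence for many genuine walking samples.
func stepHzTrainingValues(for typeName: String, from samples: [MotionSample]) -> [Double] {
    samples
        .filter { $0.confirmedType == typeName }
        .compactMap { $0.stepHz }
        .filter { typeName != "walking" || $0 > 0 }
}

let samples = [
    MotionSample(confirmedType: "walking", stepHz: 1.8),
    MotionSample(confirmedType: "walking", stepHz: 0),      // pedometer gap
    MotionSample(confirmedType: "walking", stepHz: 1.9),
    MotionSample(confirmedType: "walking", stepHz: nil),    // no pedometer data at all
    MotionSample(confirmedType: "stationary", stepHz: 0),
]

print(stepHzTrainingValues(for: "walking", from: samples))
print(stepHzTrainingValues(for: "stationary", from: samples))
```

Without the final filter, the zero from the pedometer gap would go into the walking model, which is how it ends up bloated with zero stepHz values and starts matching vaguely moving indoor samples.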

Stop using coreMotionActivityType [Live in Arc App v2.1.9]

This one is on the fence. It'll need more time and data before it becomes clear whether it's helped or hindered.

I haven't been able to observe any positive difference yet, and there has possibly been some slight negative trend. But the activity types that would benefit from this most are types that are going through a rapid geographical coverage expansion at the moment, so they're going through the usual expected downward slope in reported accuracy, before starting to trend upwards.

I'll give the coreMotionActivityType change another week or two before deciding.


dolmens · 2018-12-30T04:49:56Z

When will you be open sourcing the ML classifiers, please?

sobri909 · 2018-12-30T05:46:15Z

Hi @dolmens! The ML classifiers are already open sourced. Have a look in the develop branch under Timelines/ActivityTypes.

https://github.com/sobri909/LocoKit/tree/develop/LocoKit/Timelines/ActivityTypes

dolmens · 2018-12-30T14:25:09Z

Fantastic. Thank you for your excellent work. I'll read that code later.

sobri909 added the enhancement label Jul 29, 2018

sobri909 self-assigned this Jul 29, 2018

sobri909 added a commit that referenced this issue Aug 15, 2018

moved activity type classifiers to LocoKit from LocoKitCore #34

bdf5660

sobri909 added a commit that referenced this issue Aug 15, 2018

stop using Core Motion activityType as a classifier feature, because more often wrong than right #34

4b6f6f7

sobri909 added a commit that referenced this issue Aug 15, 2018

default to not recording Core Motion activityType, because it's not useful for much, but it costs battery #34

6d9760d

sobri909 added a commit that referenced this issue Aug 18, 2018

stop discarding zero stepHz values for walking models #34

90773a0

sobri909 added a commit that referenced this issue Aug 18, 2018

remove dead code #34

2af0ec1

sobri909 added a commit that referenced this issue Aug 22, 2018

drop coord matrix pseudo counts for road/track bound activity types #34

83838be

sobri909 added a commit that referenced this issue Aug 22, 2018

go back to discarding zero stepHz values for walking models #34

2d4e1b7

sobri909 added a commit that referenced this issue Aug 24, 2018

added tram #34

11e6150

sobri909 mentioned this issue Sep 11, 2018

Dropping support for non-persistent TimelineStore #41

Open

2 tasks

sobri909 added a commit that referenced this issue Sep 19, 2018

removed transport meta type, and collapsed to single classifier #34

71bdff3

sobri909 mentioned this issue Sep 20, 2018

Start using a "jobs queue" (ie an OperationQueue) for processing tasks #43

Closed

sobri909 added a commit that referenced this issue Sep 20, 2018

hardcode a LocoKit build number, and use it for model versions on update #34

807a3e7

sobri909 added a commit that referenced this issue Sep 20, 2018

add nine more activity types #34

07b9ae2

sobri909 changed the title ~~More activity types, custom types, and open sourcing the ML classifiers~~ More activity types, and open sourcing the ML classifiers Sep 21, 2018

sobri909 added a commit that referenced this issue Sep 22, 2018

if not D2, only return base types (all extended types are coordinate bound) #34

bc63ec3

sobri909 added a commit that referenced this issue Sep 22, 2018

rawValue of activity type names can't contain spaces or punctuation, otherwise disaster #34

738662a

sobri909 closed this as completed Nov 4, 2018
