http://blog.dimroc.com/2016/01/13/machine-learning-neighborhoods/ Run datasets through AWS Machine Learning to train a model that can tell what neighborhood a comment belongs to
config
lib
priv
test
web
.dockerignore
.envrc.example
.gitignore
Dockerfile
Makefile
README.md
brunch-config.js
docker-compose.yml
mix.exs
mix.lock
package.json


MachineLearningHoods

Uses AWS Machine Learning to predict which neighborhood a string of text most likely originates from.

How

  • Starting from a dataset of roughly 1 GB of geo-tagged tweets, we create a CSV with two columns: text and neighborhood.
  • After training and evaluating a model on this data, we expose its real-time endpoint through this Elixir application.
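The CSV step above can be sketched as follows. This is a minimal illustration in Python, not the project's actual pipeline: the README does not say how tweets are mapped to neighborhoods, so the bounding boxes, the `NEIGHBORHOOD_BOXES` table, the tweet dictionary keys (`text`, `lat`, `lon`), and both helper functions are assumptions made for the example. Only the column names `Text` and `Neighborhood` come from the input schema.

```python
import csv
import io

# Hypothetical bounding boxes: name -> (min_lat, min_lon, max_lat, max_lon).
# Real neighborhood boundaries are polygons; boxes keep the sketch short.
NEIGHBORHOOD_BOXES = {
    "Example North": (40.80, -74.00, 40.90, -73.90),
    "Example South": (40.70, -74.00, 40.80, -73.90),
}


def neighborhood_for(lat, lon):
    """Return the first neighborhood whose box contains the point, else None."""
    for name, (min_lat, min_lon, max_lat, max_lon) in NEIGHBORHOOD_BOXES.items():
        if min_lat <= lat < max_lat and min_lon <= lon < max_lon:
            return name
    return None


def tweets_to_csv(tweets, out):
    """Write Text,Neighborhood rows for tweets that fall inside a known box."""
    writer = csv.writer(out)
    writer.writerow(["Text", "Neighborhood"])
    written = 0
    for tweet in tweets:
        hood = neighborhood_for(tweet["lat"], tweet["lon"])
        if hood is None:
            continue  # skip tweets outside every known neighborhood
        # Collapse newlines and repeated spaces so each tweet stays on one line.
        text = " ".join(tweet["text"].split())
        writer.writerow([text, hood])
        written += 1
    return written


sample = [
    {"text": "coffee by the\npark", "lat": 40.85, "lon": -73.95},
    {"text": "stuck on the bridge", "lat": 40.75, "lon": -73.95},
    {"text": "far away", "lat": 10.0, "lon": 10.0},
]
buffer = io.StringIO()
rows_written = tweets_to_csv(sample, buffer)
csv_text = buffer.getvalue()
```

Tweets that fall outside every known area are dropped rather than given a catch-all label, so the model only sees examples with a definite neighborhood.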

Takeaways

  • Molding the training data to create a better model is the real challenge here.
  • A recurring question: does my data even contain statistical correlations, or is it just noise?
  • Iterating repeatedly on the model and the evaluation data seems to be what people do.
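One rough way to approach the "signal or noise" question is to compare the model's accuracy on evaluation data with a baseline that always predicts the most common neighborhood. The sketch below is a generic Python illustration, not something taken from this project; the function names and the sample labels are made up for the example.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(predicted, actual):
    """Fraction of positions where the prediction equals the true label."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up evaluation labels and predictions, for illustration only.
actual = ["A", "A", "A", "B", "C"]
predicted = ["A", "A", "B", "B", "C"]

baseline = majority_baseline_accuracy(actual)
model_accuracy = accuracy(predicted, actual)
lift = model_accuracy - baseline  # near zero suggests little learned signal
```

If the lift stays close to zero across iterations, the text may carry little information about the neighborhood, or the training data may need reshaping.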

[Figures: Prediction Matrix; Neighborhood Categories]

Input Schema

{
  "version": "1.0",
  "rowId": null,
  "rowWeight": null,
  "targetAttributeName": "Neighborhood",
  "dataFormat": "CSV",
  "dataFileContainsHeader": true,
  "attributes": [
    {
      "attributeName": "Text",
      "attributeType": "TEXT"
    },
    {
      "attributeName": "Neighborhood",
      "attributeType": "CATEGORICAL"
    }
  ],
  "excludedAttributeNames": []
}
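Before uploading, it can help to check a CSV locally against this schema. The Python sketch below is a hypothetical sanity check, not part of the project or of any AWS tooling: `check_csv_against_schema` is a name invented here, and the rule that the header should repeat the attribute names in schema order is an assumption chosen for the example.

```python
import csv
import io
import json

# Copy of the input schema shown above.
SCHEMA_JSON = """
{
  "version": "1.0",
  "rowId": null,
  "rowWeight": null,
  "targetAttributeName": "Neighborhood",
  "dataFormat": "CSV",
  "dataFileContainsHeader": true,
  "attributes": [
    {"attributeName": "Text", "attributeType": "TEXT"},
    {"attributeName": "Neighborhood", "attributeType": "CATEGORICAL"}
  ],
  "excludedAttributeNames": []
}
"""


def check_csv_against_schema(schema, csv_file):
    """Return a list of problems; an empty list means the header looks fine."""
    problems = []
    names = [attr["attributeName"] for attr in schema["attributes"]]
    if schema["targetAttributeName"] not in names:
        problems.append("target attribute is not declared in attributes")
    if schema.get("dataFileContainsHeader"):
        # Assumption: the header row mirrors the schema's attribute order.
        header = next(csv.reader(csv_file), [])
        if header != names:
            problems.append(f"header {header} does not match attributes {names}")
    return problems


schema = json.loads(SCHEMA_JSON)
good_csv = io.StringIO("Text,Neighborhood\r\nhello there,Example North\r\n")
bad_csv = io.StringIO("Neighborhood,Text\r\nExample North,hello there\r\n")

good_problems = check_csv_against_schema(schema, good_csv)
bad_problems = check_csv_against_schema(schema, bad_csv)
```

Returning a list of problems instead of raising on the first one lets a caller report every mismatch in a single pass.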