MachineLearningHoods

Uses AWS Machine Learning to predict which neighborhood a string of text most likely originates from.

How

  • Using a dataset of ~1 GB of geo-tagged tweets, we create a CSV with two columns: Text and Neighborhood.
  • After training and evaluating a model on this data, we expose its real-time prediction endpoint through this Elixir application.
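
The CSV-building step above could be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the bounding boxes, the `lat`/`lon`/`text` tweet fields, and the helper names `neighborhood_for` and `tweets_to_csv` are all hypothetical, and a real dataset would more likely use neighborhood polygons than rectangles.

```python
import csv
import io

# Hypothetical bounding boxes: name -> (min_lat, min_lon, max_lat, max_lon).
# Real neighborhood boundaries are polygons; rectangles keep the sketch short.
NEIGHBORHOOD_BOXES = {
    "Williamsburg": (40.700, -73.970, 40.725, -73.935),
    "East Village": (40.720, -73.992, 40.735, -73.972),
}


def neighborhood_for(lat, lon):
    """Return the first neighborhood whose box contains the point, else None."""
    for name, (min_lat, min_lon, max_lat, max_lon) in NEIGHBORHOOD_BOXES.items():
        if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
            return name
    return None


def tweets_to_csv(tweets, out):
    """Write a Text,Neighborhood CSV to `out`; return the number of data rows.

    Tweets that fall outside every known box are skipped.
    """
    writer = csv.writer(out)
    writer.writerow(["Text", "Neighborhood"])
    written = 0
    for tweet in tweets:
        hood = neighborhood_for(tweet["lat"], tweet["lon"])
        if hood is None:
            continue
        # Collapse whitespace so each tweet stays on a single CSV row.
        text = " ".join(tweet["text"].split())
        writer.writerow([text, hood])
        written += 1
    return written
```

The header names match the attribute names in the input schema, so the same file can be used with `dataFileContainsHeader` set to true.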

Takeaways

  • Shaping the training data to produce a better model is the real challenge here.
  • An open question: does the data contain real statistical correlations, or is it just noise?
  • The common practice seems to be to iterate repeatedly on the model and the evaluation data.
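
One hedged way to probe the "correlation or noise" question is to compare the model's accuracy on held-out data against a majority-class baseline, which always predicts the most common neighborhood. The sketch below is an assumption about how one might do this, not something the project is known to do; the function names are hypothetical.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)
```

If the model's accuracy is not clearly above the baseline, the text features are probably contributing little signal.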

[Figure: Prediction Matrix]

[Figure: Neighborhood Categories]

Input Schema

{
  "version": "1.0",
  "rowId": null,
  "rowWeight": null,
  "targetAttributeName": "Neighborhood",
  "dataFormat": "CSV",
  "dataFileContainsHeader": true,
  "attributes": [
    {
      "attributeName": "Text",
      "attributeType": "TEXT"
    },
    {
      "attributeName": "Neighborhood",
      "attributeType": "CATEGORICAL"
    }
  ],
  "excludedAttributeNames": []
}
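
Before uploading, it can help to check that the CSV header lines up with this schema. The following is a small sketch under the assumption that CSV columns must appear in the same order as the schema's `attributes`; `check_header` is a hypothetical helper, not part of any AWS tooling.

```python
import csv
import io
import json

# The input schema from this README, embedded so the sketch is self-contained.
SCHEMA_JSON = """
{
  "version": "1.0",
  "rowId": null,
  "rowWeight": null,
  "targetAttributeName": "Neighborhood",
  "dataFormat": "CSV",
  "dataFileContainsHeader": true,
  "attributes": [
    {"attributeName": "Text", "attributeType": "TEXT"},
    {"attributeName": "Neighborhood", "attributeType": "CATEGORICAL"}
  ],
  "excludedAttributeNames": []
}
"""


def check_header(schema, csv_text):
    """Return a list of problems; an empty list means the header looks fine."""
    problems = []
    expected = [attr["attributeName"] for attr in schema["attributes"]]
    if schema["targetAttributeName"] not in expected:
        problems.append("target attribute is not listed in attributes")
    if schema.get("dataFileContainsHeader"):
        # Read only the first row of the CSV text.
        header = next(csv.reader(io.StringIO(csv_text)), [])
        if header != expected:
            problems.append(f"header {header} does not match {expected}")
    return problems


schema = json.loads(SCHEMA_JSON)
```

Calling `check_header(schema, csv_text)` on the first few lines of the training file is enough, since only the header row is read.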

About

Run datasets through AWS Machine Learning to train a model that can tell what neighborhood a comment belongs to.

Blog post: http://blog.dimroc.com/2016/01/13/machine-learning-neighborhoods/
