
Commit

Fix typo
Sdedelbrock committed Jun 13, 2019
1 parent 4b7e0e7 commit 60eec53
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/formatting_data.rst
@@ -26,7 +26,7 @@ NOTE: We assume each column is a numerical column, unless you specify otherwise
#. ``attribute_name: 'categorical'`` All attribute names that hold a string in any of the rows after the header row will be encoded as categorical data. If, however, you have any numerical columns that you want encoded as categorical data, you can specify that here.
#. ``attribute_name: 'nlp'`` If any of your data is a text field that you'd like to run some Natural Language Processing on, specify that in the header row. Data stored in this attribute will be encoded using TF-IDF, along with some other feature engineering (count of some aggregations like total capital letters, punctuation characters, smiley faces, etc., as well as a sentiment prediction of that text).
#. ``attribute_name: 'ignore'`` This column of data will be ignored.
-#. ``attribute_name: 'date'`` Since ML algorithms don't know how to handle a Python datetime object, we will perform feature engineering on this object, creating new features like day_of_week, or minutes_into_day, etc. Then the original date field will be removed from the training data so the algorithsm don't throw a TypeError.
+#. ``attribute_name: 'date'`` Since ML algorithms don't know how to handle a Python datetime object, we will perform feature engineering on this object, creating new features like day_of_week, or minutes_into_day, etc. Then the original date field will be removed from the training data so the algorithms don't throw a TypeError.
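
The ``'date'`` handling described above can be illustrated with a minimal sketch. This is not the library's actual implementation. The function name ``date_features``, the column name ``created_at``, and the derived feature names are assumptions made up for this example. It only shows the general idea: derive numeric features from a Python ``datetime`` and drop the original field.

```python
from datetime import datetime


def date_features(row, date_col):
    """Return a copy of `row` with `date_col` replaced by numeric features.

    Hypothetical helper for illustration only; the real feature set and
    naming used by the library may differ.
    """
    out = dict(row)
    dt = out.pop(date_col)  # remove the raw datetime so models never see it
    out[date_col + "_day_of_week"] = dt.weekday()  # Monday == 0
    out[date_col + "_minutes_into_day"] = dt.hour * 60 + dt.minute
    return out


# Hypothetical row with one numerical column and one date column.
row = {"price": 12.5, "created_at": datetime(2019, 6, 13, 14, 30)}
features = date_features(row, "created_at")
```

After this transformation every remaining value is numeric, which is why removing the original field avoids the ``TypeError`` mentioned in the list item.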


Passing in your own feature engineering function
