Skip to content

Assignments

Awantik Das edited this page Mar 31, 2017 · 2 revisions
  1. Try with reduced baby names dataset first

  2. Get the country names for same baby names

  3. Country names should not be repeated

  4. For large datasets, use saveAsTestFile to store the contents

Spark Machine Learning Project

Allstate Claims Severity

When you’ve been devastated by a serious car accident, your focus is on the things that matter the most: family, friends, and other loved ones. Pushing paper with your insurance agent is the last place you want your time or mental energy spent. This is why Allstate, a personal insurer in the United States, is continually seeking fresh ideas to improve their claims service for the over 16 million households they protect.

  • Statement - creating an algorithm which accurately predicts claims severity.

  • About Data - Each row in this dataset represents an insurance claim. You must predict the value for the 'loss' column. Variables prefaced with 'cat' are categorical, while those prefaced with 'cont' are continuous.