Text Preprocessing
Updated Nov 12, 2017 - Python
Programs I write for my Data Mining course
My solution for the Kaggle NYC Taxi Fare Prediction competition (ranked 21st/1463)
Spark-lean, an interactive PySpark-based Data Cleaning Library
A package to split a complex text into simple sentences.
Python scripting challenge
Joining, cleaning, querying, and performing ETL on a Twitter posts dataset.
A web-based image classifier to classify cricketers such as 'AB de Villiers', 'Brian Lara', 'Rahul Dravid', 'Rohit Sharma', 'Sachin Tendulkar', 'Shane Warne', 'Virat Kohli'
An efficient and optimized search engine
This sweet little program is to a dataset what soap is to your body. The end result will be clean, shiny, and more beautiful. Check it out.
Data analysis and visualization of New York Yellow Taxi trip data. The core objective is to find the locations with the most pickups and drop-offs, identify the times of heaviest traffic, and work out how to meet the needs of the public, using Big Data technologies and Tableau.
Python module for parsing PDF and scraping URLs
Resume Screening using Machine Learning and Python
Compilation of code and projects written in different programming languages and technology domains.
Data Analysis / Data Science mini-project. The project includes exploring the three datasets, cleaning the data, making it uniform, changing headers, changing the index, merging data, and manipulating it using NumPy and pandas. More information in project_description.txt