Wikidata and Wikipedia language data extraction
Updated Jun 8, 2024 - Python
Client interface for all things Cleanlab Studio
The open-source tool for building high-quality datasets and computer vision models
A coding tool developed in R to process water analysis results exported from the ALS WEBTRIEVE™ data portal. The data are cleaned, merged, and written to archival (e.g., CSV) or visual (e.g., HTML) formats.
I will be sharing data analysis and data visualization projects here!
This project analyzes tumor cell data from 550 patients using Python. It involves data cleaning, exploratory analysis, feature engineering, and machine learning to classify tumors as malignant or benign. Techniques include PCA, logistic regression, and k-fold cross-validation to assess model accuracy and reliability.
Prepping tables for machine learning
💻☕ This repository is a resource for learning data science, including learning materials and projects. It covers topics such as data analysis, machine learning, and programming.
A comprehensive analysis in R of weak signal propagation within the WSPR network, focusing on the impact of distance, frequency, and power on signal-to-noise ratios. The project includes data cleaning, statistical analysis, and linear regression modeling to predict signal reception quality and understand the factors influencing signal propagation.
Standardise TR/MH data
Data cleaning and preprocessing before applying ML models. EDA and visualizations for agricultural data in the EU.
This repository contains the project work completed as part of the Accenture North America Data Analytics and Visualization Job Simulation on Forage in June 2024. The project focused on analyzing and visualizing data for a social media client to uncover insights and inform strategic decisions.
Easy-to-use Python library of customized functions for cleaning and analyzing data.
Objective: Analyze the impact of Airbnb on the rental and real estate markets in various Californian regions using data visualization techniques.
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
🗺️ Data Cleaning and Textual Data Visualization 🗺️
Test data management tool for any data source, batch or real-time
A light-weight, flexible, and expressive statistical data testing library