Package for #nocode machine learning. Supported in all #nocode programming languages.
10 easy steps:
- Install this package
- Define business problem & objective
- Provide clean data
- Define offline validation approach
- Train the model (one-click)
- Validate the model offline
- Deploy the model (one-click)
- Validate the model online
- Share the results
- Maintain the model
1. Install this package via the command below. Then, click here.
2. Define your business problem and objective. Then, click here.
- Get buy-in from all stakeholders involved, including their pet 🐶/🐱/🐟/🐵.
- (Note: There will likely be conflicting objectives. E.g., customer experience wants to remove counterfeit/low-quality products (to protect customers), but commercial refuses as they think it'll reduce revenue.)
- It's okay if you don't have the problem defined. Let's train some ML first and figure it out later.
- It's okay if you don't have the objective defined. You can decide after viewing the A/B test results.
- (Optional) Decide how your ML model will benefit customers. Will it (i) be integrated into an existing system, (ii) need a new UI, (iii) augment decision-making, (iv) something else?
3. Provide your pristine data. Then, click here.
- Upload your data as a single denormalized `csv`; file size should not exceed 1 GB.
- Data should not have missing values. Decide whether to exclude at row or column level, impute via statistics (e.g., median, mode), machine learning, or a specified null value (e.g., `NA`, `-1`).
- For string values: `ASCII` encoded, lowercased, spellchecked & normalized (see "60 ways to spell Philidelphia" below), naughty words removed.
- For numerics: Parsed correctly (e.g., `"$1.00"`, `"USD1.00"`, `"0.85 €"` should all be `1.0`), exclude errors (e.g., age > 200) and possibly outliers.
- For dates: Formatted based on ISO 8601.
- For human genes: Formatted based on industry best practice.
- (Optional) Remove redundant columns (e.g., only a single value, >95% missing values, low variance, etc.)
- (Optional) Remove redundant rows (e.g., exact duplicates, >95% missing values, etc.)
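If you insist on writing #code, the cleaning rules above can be sketched with pandas. (A sketch only; the column names, null policy, and date format are hypothetical, and this is emphatically not what the package does — the package does nothing.)

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["PHILADELPHIA", "PHILLY ", None],
    "price": ["$1.00", "USD1.00", "0.85 €"],
    "signup": ["31/01/2024", "01/02/2024", "29/02/2024"],
})

# Strings: lowercase, strip whitespace, fill missing values with a null token.
df["city"] = df["city"].str.lower().str.strip().fillna("NA")

# Numerics: strip currency symbols/codes before parsing as float.
df["price"] = df["price"].str.replace(r"[^\d.]", "", regex=True).astype(float)

# Dates: parse the source format, then normalize to ISO 8601 (YYYY-MM-DD).
df["signup"] = pd.to_datetime(df["signup"], format="%d/%m/%Y").dt.strftime("%Y-%m-%d")
```

Note that, per the joke above, `"$1.00"`, `"USD1.00"`, and `"0.85 €"` parse to different numbers unless you also ignore currency conversion, which this sketch dutifully does.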
60 ways to spell "Philidelphia"
PHAILLIDELPPHA
PHIADELPHIA
PHIALDELPHIA
PHIDAELPHIA
PHIELADELPHIA
PHIILADELPHIA
PHILA
PHILA.
PHILAD
PHILADALPHIA
PHILADEDLPHIA
PHILADELAPHIA
PHILADELELPHIA
PHILADELHIA
PHILADELHIPHILADELHPIA
PHILADELHPIA
PHILADELOHIA
PHILADELPH
PHILADELPHA
PHILADELPHAI
PHILADELPHI
PHILADELPHIA
PHILADELPHIA PA
PHILADELPHIA,
PHILADELPHIA, PA
PHILADELPHIA.
PHILADELPHIAPHIA
PHILADELPHIOA
PHILADELPHIOE
PHILADELPIA
PHILADELPOHIA
PHILADELPPHIA
PHILADEPHA
PHILADEPHIA
PHILADEPHILA
PHILADEPLHIA
PHILADLEPHIA
PHILADPHIA
PHILAELPHIA
PHILDADELPHIA
PHILDADLPHIA
PHILDEALPHIA
PHILDEALPHIA
PHILDELPHIA
PHILDELPHILA
PHILDEPPHIA
PHILDRLPHIA
PHILEAPHIA
PHILIAHELPHIA
PHILIDELPHIA
PHILLA
PHILLADELPHIA
PHILLY
PHILOADELPHIA
PHLADELPHIA
PHOLADELPHIA
PHPILADELPHIA
PIHLADELPHIA
Suggested #nocode tools:
- To query and join database tables to get the single `csv`, try these #nocode tools: SQL
- To clean tabular data, try: pandas, Spark, Excel, Numbers, Sheets, OpenOffice Calc
- To clean and augment images, try: torchvision, Paint, Preview
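One #code way (sorry) to normalize the misspellings above is fuzzy matching against a canonical list — a stdlib-only sketch using `difflib`; the canonical list and cleanup steps are assumptions, not part of this package:

```python
from difflib import get_close_matches

# Hypothetical canonical city names to match against.
CANONICAL = ["PHILADELPHIA", "PITTSBURGH", "PHOENIX"]

def normalize_city(raw: str) -> str:
    """Map a (possibly misspelled) city string to its closest canonical name."""
    # Drop trailing punctuation and state suffixes like ", PA".
    cleaned = raw.strip().rstrip(".,").split(",")[0].strip()
    matches = get_close_matches(cleaned, CANONICAL, n=1, cutoff=0.6)
    return matches[0] if matches else cleaned

normalize_city("PHILIDELPHIA")      # -> "PHILADELPHIA"
normalize_city("PHILADELPHIA, PA")  # -> "PHILADELPHIA"
```

Short forms like `"PHILLY"` fall below the similarity cutoff and would need an explicit alias table.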
4. Set (offline) validation approach and metrics. Then, click here.
- Decide how to split the data into train, validation, and test. (By default, random-split is used, though a time-based split should be used in most production settings.)
- Decide on metric(s). (By default, RMSE and accuracy are selected; pick whichever looks best after validation.)
- (Note: Upgrade to PRO edition and get 100+ metrics sorted in order of "What looks best").
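For the time-based split recommended above, a minimal pandas sketch (the timestamp column and split fractions are illustrative assumptions):

```python
import pandas as pd

def time_split(df: pd.DataFrame, ts_col: str, val_frac: float = 0.1, test_frac: float = 0.1):
    """Chronological split: oldest rows train, newest rows test (no leakage from the future)."""
    df = df.sort_values(ts_col).reset_index(drop=True)
    n_test = int(len(df) * test_frac)
    n_val = int(len(df) * val_frac)
    train = df.iloc[: len(df) - n_val - n_test]
    val = df.iloc[len(df) - n_val - n_test : len(df) - n_test]
    test = df.iloc[len(df) - n_test :]
    return train, val, test

events = pd.DataFrame({"ts": pd.date_range("2024-01-01", periods=10, freq="D"), "y": range(10)})
train, val, test = time_split(events, "ts")
```

Unlike a random split, every validation/test row here is strictly later than every training row, which is what most production settings need.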
5. Train the machine learning model via one-click.
- This is the easiest step of all; click on this above ☝️
- The package will run all supervised, unsupervised, semi-supervised, self-supervised, reinforcement, transfer, ensemble, meta, few-shot, one-shot, blockchain learning models, starting with the most compute-intensive.
6. Validate the model offline. Then, click here.
- If you have done steps 2 and 4 properly, this will be straightforward.
- (Optional) Model analysis via learning curves, precision-recall trade-offs, residual analysis, etc.
- (Optional) Feature importance and selection. (By default, all features are selected even if only 1% are useful.)
- (Optional) Error analysis, examine counterfactuals and skewed classes (and adjust distribution).
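For the precision-recall trade-off mentioned above, the underlying arithmetic fits in a few stdlib-only lines (labels and predictions are made up for illustration):

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Lowering the decision threshold typically raises recall at the cost of precision.
precision_recall([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])  # (2/3, 2/3)
```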
7. Deploy the ML model via one-click.
- This is also easy; click on this above ☝️
- By default, the model with the best metric is deployed (even if it requires 10x compute and data for training, has 100x inference latency, and 0.001% improvement relative to the 2nd best).
- (Optional) Decide serving approach: Cache, microservice, or embedded in app? (By default, served via `csv`.)
- (Optional) Perform QA, integration testing, and stress testing to ensure optimal customer experience.
8. Validate the model online (i.e., A/B testing). Then, click here.
- Estimate effect size and decide on sample size required.
- Decide on random assignment condition: By customer, session, or product?
- Decide on attribution model: First touch, last touch, multi-touch, or no-touch?
- Decide on statistical approach: Frequentist, Bayesian, or Torturean?
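For the effect-size/sample-size estimate above, the standard two-proportion power calculation can be sketched with the stdlib (baseline rate, minimum detectable effect, and significance/power defaults are assumed values, and Torturean statistics are left as an exercise):

```python
from math import sqrt
from statistics import NormalDist

def sample_size_per_arm(p_base: float, mde: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate n per arm to detect an absolute lift `mde` over baseline rate `p_base`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    p_treat = p_base + mde
    p_bar = (p_base + p_treat) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p_base * (1 - p_base) + p_treat * (1 - p_treat))) ** 2
    return int(numerator / mde**2) + 1

n = sample_size_per_arm(0.10, 0.02)  # ~2pp lift on a 10% baseline: a few thousand per arm
```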
9. Share the results. Then, click here.
- Share the best results (even if it's a warehouse optimization model but recommendation CTR goes up).
- Design fancy slides and label everything related to statistics and ML as Artificial Intelligence™.
- You're done! 🎉
- ML models don't need to be refreshed or maintained. (But if you want unnecessary work, read this.)
There's no need to; there's #nocode! But if you want to contribute to the README, raise a PR.
Impossible! Our package has #nobugs as it is #nocode.
Yes.
It's partly (i) a joke, (ii) a point about the non-ML code related work, and (iii) a basic ML workflow.
- Add quick start
- Add no code style guide
- Add license
- Add unit tests
- Add code coverage checks
- Add lint checks
- Add type checks
- Add CI/CD pipeline
- Build CLI for developer experience (out of scope as #nocode)