Splitting with hashlib #628

@Dess1996

Good day, ageron. You have a very nice book.

Anyway, I have one question about Chapter 2. Which of the following algorithms given in the book is the fastest?

1. `def split_train_test(data, test_ratio)`
2. `def test_set_check(identifier, test_ratio, id_column, hash=hashlib.md5)`
3. `train_test_split` from sklearn

Which function uses the least memory?

I would be grateful if you could comment on this. Thanks for your attention.
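
For reference, one way to check this empirically is to time each function and record its peak memory. The following is only a minimal sketch: it assumes plain Python lists of tuples in place of a pandas DataFrame, and `split_by_shuffle`, `split_by_hash`, `in_test_set`, and `measure` are hypothetical stand-ins written for illustration, not the book's code. The results will depend on the data size and the machine.

```python
import hashlib
import random
import timeit
import tracemalloc


def split_by_shuffle(rows, test_ratio, seed=42):
    """Random split: shuffle the row indices, then slice off the test part."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    test_size = int(len(rows) * test_ratio)
    test = [rows[i] for i in indices[:test_size]]
    train = [rows[i] for i in indices[test_size:]]
    return train, test


def in_test_set(identifier, test_ratio):
    """Hash check: the last byte of the MD5 digest decides membership."""
    digest = hashlib.md5(str(identifier).encode("utf-8")).digest()
    return digest[-1] < 256 * test_ratio


def split_by_hash(rows, test_ratio):
    """Stable split: each row's id (assumed to be its first field) picks its set."""
    train, test = [], []
    for row in rows:
        (test if in_test_set(row[0], test_ratio) else train).append(row)
    return train, test


def measure(func, *args, repeat=3):
    """Return (best wall time in seconds, peak traced bytes) for func(*args)."""
    best = min(timeit.repeat(lambda: func(*args), number=1, repeat=repeat))
    tracemalloc.start()
    func(*args)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return best, peak


if __name__ == "__main__":
    # Synthetic rows of (id, value); replace with real data to get a useful number.
    rows = [(i, i * 0.5) for i in range(100_000)]
    for func in (split_by_shuffle, split_by_hash):
        seconds, peak = measure(func, rows, 0.2)
        print(f"{func.__name__}: {seconds:.4f} s, peak {peak / 1024:.0f} KiB")
```

The same `measure` helper could be pointed at sklearn's `train_test_split` or at the book's pandas-based functions by passing them in place of the stand-ins.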
