Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/experiment #201

Merged
merged 33 commits into from
Oct 31, 2022
Merged

Feature/experiment #201

merged 33 commits into from
Oct 31, 2022

Conversation

Atahanak
Copy link
Collaborator

Initial PR for the Experiment namespace. The experiment infrastructure is implemented along with some additional example functions for the user and two lib examples.

Experiment tutorial and advanced testing left to another PR.

@Atahanak Atahanak added this to the Milestone 3 milestone Oct 10, 2022
@Atahanak Atahanak linked an issue Oct 10, 2022 that may be closed by this pull request
@Atahanak Atahanak added priority: now Critical priority type: feature Brand new functionality, features, workflows, endpoints, etc labels Oct 10, 2022
@AmroAlJundi AmroAlJundi changed the base branch from main to develop October 11, 2022 11:13
@AmroAlJundi AmroAlJundi removed this from the Milestone 3 milestone Oct 17, 2022
Copy link
Contributor

@AmroAlJundi AmroAlJundi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. But I have some comments, requests, and discussions :)

  • Add some tests if possible.

  • Make LoadDataFunction an std::function rather than a function pointer. This allows for us to pass formats to the loader from main as shown here (I compiled this and it worked):
    image
    Note: this would require the _dataLoaders member to become vector<LoadDataFunction>.

  • Make LoadDataFunction take a map of strings->strings, vector of strings, or a map string->any.

    • Maybe additional strings can be part of the std::any parameter in the pair?
  • I like how the file parameters are passed to the kernel. I think we should generalize this further:

    1. Just like how files have an associated std::any parameter, I think preprocessing should have one. Kernel functions would take both the file params and the preprocess params.
    2. Pass the file params object to the preprocessing function.

    In other words, a file will have its params, a preprocess its params, and a kernel its params. A loader functions will know about the params of its input, a preprocess function will know about its params and the params of the file it's reading, and a kernel function will know about its params, the preprocess params used for this iteration, and the params of the input it's using.

    Just to clarify, I mean that the same file params that you pass to the kernel function you pass to the preprocess and loaded function, and the same preprocess param you pass to the kernel you pass to the preprocessing, i.e. a single param object per function.

  • The names of data members should have the underscore at the end of the name not the beginning.

  • I think we should have a function that generates IDs. This way, the user can generate the ID himself in case he wanted to query something. This is not that important though.

@Atahanak Atahanak linked an issue Oct 24, 2022 that may be closed by this pull request
docs/pages/tutorials/003_experiment.md Outdated Show resolved Hide resolved
docs/pages/tutorials/003_experiment.md Outdated Show resolved Hide resolved
docs/pages/tutorials/003_experiment.md Show resolved Hide resolved
docs/pages/tutorials/003_experiment.md Outdated Show resolved Hide resolved
tutorials/003_experiment/start_here/tutorial_003.cc Outdated Show resolved Hide resolved
AmroAlJundi
AmroAlJundi previously approved these changes Oct 31, 2022
Copy link
Contributor

@AmroAlJundi AmroAlJundi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Miniscule changes were requested, but they're not that important.

exp.AddPreprocess("RCM", experiment::Reorder<RCMReorder, CSR, CPUContext, row_type, nnz_type, value_type>, params);
```

As the first parameter of the `AddPreprocess` function we provide an identifier that is necessary to distinguish the results, auxiliary data, and runtimes generated as a result of the experiment.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can be more specific here that this identifier is for the runtimes, results, and aux data of experiments that use this specific preprocessing. It sort of feels vague to me here 😅

docs/pages/tutorials/003_experiment.md Show resolved Hide resolved
@Atahanak Atahanak merged commit 6259c14 into develop Oct 31, 2022
@Atahanak Atahanak deleted the feature/Experiment branch October 31, 2022 16:52
SinanEkm pushed a commit that referenced this pull request Aug 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: now Critical priority type: feature Brand new functionality, features, workflows, endpoints, etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Write expriment tutorial Design the performance measruement pipeline (experiment pipeline)
2 participants