Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isMAC() fixed #15

Merged
merged 19 commits into from Mar 7, 2024
Merged

isMAC() fixed #15

merged 19 commits into from Mar 7, 2024

Conversation

MeowMJ
Copy link
Collaborator

@MeowMJ MeowMJ commented Feb 17, 2024

After combineMulAdd(), the opcodeName changes. So isMAC() detects new opcodeName to verify whether it is MAC and isMAC() won't always return true.

@tancheng
Copy link
Owner

Which fiction fuse mul with add into new muladd?

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 17, 2024

Function combineMulAdd(). But it's not used in this mapper in fact.

@tancheng
Copy link
Owner

Function combineMulAdd(). But it's not used in this mapper in fact.

Can you please apply that combineMulAdd() on a fir kernel and show the two DFGs (before and after) here?

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 17, 2024

Function combineMulAdd(). But it's not used in this mapper in fact.

Can you please apply that combineMulAdd() on a fir kernel and show the two DFGs (before and after) here?

Fir kernel doesn't contain a mul and add that can be combined, So I show inttCore kernel (apply combine("mul","sub") before and after) instead.

I notice that combineMulAdd() is not as useful as combine("mul","add") or combine("mul","sub") or others, because it only combines Critical dfgNode.

Note that (19)mul and (21)sub are combined in InttAfterFuse.png and become (19)mulsub.

InttBeforeFuse.png
InttBeforeFuse
InttAfterFuse.png
InttAfterFuse

@tancheng
Copy link
Owner

tancheng commented Feb 17, 2024

  • Why 19 and 21 are combined rather than 20 and 21? Which pair has the higher priority to be combined in your opinion?
  • FIR kernel has a fmul followed by fadd. So that pair can also be combined, right?
  • The main purpose of this Pull Request (PR) is that when users enable MAC (or FMAC) in the GUI, all (or some of) the mul and add (or fmul and fadd) need to be combined.

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 17, 2024

  • I think (19)MulAdd means it is (19)Mul other than (20)Mul because it inherit the same number from (19)Mul.
  • I notice it too. I think it is because fadd has a circule with phi, making fmul and fadd can't be combined.
  • I will check the launchUI.py and rewrite the code (including the above two). Traveling to school soon and will have plenty of time for programming :D

@tancheng
Copy link
Owner

  • You didn't answer my first question. I was asking why 19 mul is chosen but 20 mul is not chosen.
  • Which line of code avoids such fusion?
  • Thanks for the contribution, take your time, and enjoy the holidays.

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 17, 2024

I will check the combine function later to answer your question, Sir. Right now, I have a busy time struggling with my relatives QAQ.

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 20, 2024

  • Why 19 and 21 are combined rather than 20 and 21? Which pair has the higher priority to be combined in your opinion?
  • FIR kernel has a fmul followed by fadd. So that pair can also be combined, right?
  • The main purpose of this Pull Request (PR) is that when users enable MAC (or FMAC) in the GUI, all (or some of) the mul and add (or fmul and fadd) need to be combined.
  • I print the chosen order of combine() function and find that 19 mul is analyzed before 20 mul, so it's chosen first.
  • I remove isCritical() from combineMulAdd(), so fmul can be combined with fadd now. Note that the changed code is not in this PR.
  • Since MAC is enabled by default, set combineMulAdd() positive will address your issue.

Below is Fir before and after combineMulAdd()(the one I remove judgment of critical nodes). Note that (6)mul and (7)add are combined and become (6)fmuladd.

firBeforeFuse.png
firBeforeFuse
firAfterFuse.png
firAfterFuse

@tancheng
Copy link
Owner

  • Task 1: Don't you think 20 should have higher priority to be combined? Any thoughts?
  • Task 2: Can you also provide a fusion function that fuse the add/compare/br/phi (i.e., 1/9/10/11) into single op? Similar to the add/phi (i.e., 0/7)?

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 21, 2024

  • Task 1: If 20 is combined, the number of nodes from (13)zext to (25)store is reduced. So 20 should have higher priority to be combined? But the total number of nodes is the same whether you combine 19 or 20.
  • Task 2: I write a new DFG::combineAddCmpBranchPhi() to address it. The result seems fine but I want to check it twice before submission.
    firAfterFuseCombineFour

@tancheng
Copy link
Owner

Nice.

  • I think there are priorities for fusion. The ones in the critical path has the highest priority, say in your attached figure, the phi is not able to be fused with add to decrease the RecMII to 1, because the add is fused with mul first. For the 13 to 25 case, the ones that can decrease the overall height should have higher priority (agree? This may help the mapping. But I think you don't need to implement this case for now). Please focus on increasing the priority of the fusion on critical path first.
  • Rename DFG::combineAddCmpBranchPhi() to combineForIter()? The function should also combine 0 and 7.
  • Can you try to provide some tests for your modifications?
    Thanks

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 22, 2024

So the purpose of combineForIter() is to combine any kind of critical path, no matter it's add/compare/br/phi or add/phi (those two are the most common ones but there exit other critical paths)? In that way, I need to change the logic of combineForIter().

I also want to know if this task is relevant to the mail you sent to us (Help needed ...) before or not.

@tancheng
Copy link
Owner

Yes, we want to fuse all the nodes within the critical cycle. However, we also want to make it parameterizable, for example, some cases, we may not want to over fuse them. So we can start from the above two cases I guess?

Yes, this is related to the mail. Fusion is a critical step towards better mapping.

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 22, 2024

OK, I will try to leave a parameter interface in combineForIter() and try to provide some tests for my modifications.

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 26, 2024

I made it in a recursive way, sir. But I want to confirm some questions.

  • only patterns on critical path will be combined?

@tancheng
Copy link
Owner

Cool, great progress. Recursive way is very popular for pattern recognition. I am fine with either recursive or iterative.
It would be perfect to also fuse the pattern in the non-critical paths. But if you think that is not trivial engineering effort, we can leave it to another PR.

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 27, 2024

DFG::combineIter() and DFG::combineForIter() is made to fuse the pattern in DFGs regardless the order or size of the pattern. I update the codes and you can merge the PR if there is no question.
Here is an example of DFG of fir after fusion where the pattern is "add -cmp-br-phi" and "phi-add".
firAfterFuse

@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 27, 2024

combineForIter() is to accept fusion pattern that user provides and it will call combineIter().
combineIter() is a recursive function that fuse patterns.

Here is an example of code of fusion pattern "add -cmp-br-phi" and "phi-add".

    list<string>* combinePattern1 = new list<string>[4]; 
    combinePattern1 -> push_back("add");
    combinePattern1 -> push_back("icmp");
    combinePattern1 -> push_back("br");
    combinePattern1 -> push_back("phi");
    list<DFGNode*>* cPatternNodes1 = new list<DFGNode*>[4]; 
    combineForIter(combinePattern1, cPatternNodes1);

    list<string>* combinePattern2 = new list<string>[2]; 
    combinePattern2 -> push_back("phi");
    combinePattern2 -> push_back("fadd");
    list<DFGNode*>* cPatternNodes2 = new list<DFGNode*>[2]; 
    combineForIter(combinePattern2, cPatternNodes2);

src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
DFG::combineForIter() combine the original two functions and can deal with one string pattern with multiple cycles.
@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Feb 28, 2024

I update combineForIter() and combineIter() into one function and this is an example code of how to use this function. The result DFG is the same as before.

    string* combinePattern1 = new string[4];
    combinePattern1[0] = "add";
    combinePattern1[1] = "icmp";
    combinePattern1[2] = "br";
    combinePattern1[3] = "phi";
    list<DFGNode*>* cPatternNodes1 = new list<DFGNode*>[4]; 
    combineForIter(combinePattern1, 4, cPatternNodes1);

    string* combinePattern2 = new string[2]; 
    combinePattern2[0] = ("phi");
    combinePattern2[1] = ("fadd");
    list<DFGNode*>* cPatternNodes2 = new list<DFGNode*>[2]; 
    combineForIter(combinePattern2, 2, cPatternNodes2);

Copy link
Owner

@tancheng tancheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you try to click on Resolve conversation with appropriate comments (e.g., fixed, won't fix).

src/DFG.cpp Outdated Show resolved Hide resolved
I use list<string>* instead of string* and initialize t_matchedNodes inside the function.
@MeowMJ
Copy link
Collaborator Author

MeowMJ commented Mar 2, 2024

I update combineForIter() with one parameter and fix the name issue. The result DFG is the same as before.

    list<string>* combinePattern1 = new list<string>[4]; 
    combinePattern1 -> push_back("add");
    combinePattern1 -> push_back("icmp");
    combinePattern1 -> push_back("br");
    combinePattern1 -> push_back("phi");
    combineForIter(combinePattern1);


    list<string>* combinePattern2 = new list<string>[2]; 
    combinePattern2 -> push_back("phi");
    combinePattern2 -> push_back("fadd");
    combineForIter(combinePattern2);

src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
Fixed all your needs
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
src/DFG.cpp Outdated Show resolved Hide resolved
@tancheng tancheng merged commit ff5a806 into tancheng:master Mar 7, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants