Just to verify my test result #8
Yes, the results look the way they are supposed to!
Glad you figured it out by yourself! Nice work!
On Thu, Oct 24, 2019, 4:22 PM ywu-stats wrote:
Hi,
I was finally able to run it through with the sample data.
I just wanted to verify that my results look as expected (attached
visualization). Very cool visualization though!
[image: test]
<https://user-images.githubusercontent.com/56888960/67532216-54c2b400-f67a-11e9-9525-4df95579f4a2.png>
|
Now you can tell how badly I wanted to make use of this. |
It's actually fairly easy to get the userid for each cluster. If you look at visulization.py, you can see a function called allUser, which takes in a tree/sub-tree and returns all the users in it. For ways to traverse the tree structure, you can look at lines 59-79 in visulization.py. As to the ngrams, you can look at lines 90-91 in visulization.py. You might need a bit of knowledge of Python to modify the code to suit your own needs, but it should be fairly straightforward. |
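To illustrate the idea of a function like allUser (one that takes a tree or sub-tree and returns all users in it), here is a minimal sketch. The tree layout below is purely hypothetical: it assumes internal nodes are dicts with a "children" list and leaf clusters are dicts with a "users" list. The actual structure used in visulization.py may well differ, so the traversal would need adapting. The names all_users and users_per_cluster are made up for this example.

```python
# Hypothetical tree layout, NOT the repository's actual format:
# an internal node is {"children": [...]} and a leaf is {"users": [...]}.

def all_users(node):
    """Collect every user ID under a tree or sub-tree (depth-first)."""
    if "users" in node:  # leaf cluster
        return list(node["users"])
    users = []
    for child in node.get("children", []):
        users.extend(all_users(child))
    return users


def users_per_cluster(node, path="root"):
    """Map each leaf cluster (named by its path from the root) to its user IDs."""
    if "users" in node:
        return {path: list(node["users"])}
    clusters = {}
    for i, child in enumerate(node.get("children", [])):
        clusters.update(users_per_cluster(child, f"{path}/{i}"))
    return clusters


tree = {"children": [
    {"users": ["u1", "u2"]},
    {"children": [{"users": ["u3"]}, {"users": ["u4", "u5"]}]},
]}
print(all_users(tree))
print(users_per_cluster(tree))
```

Naming clusters by their path from the root is just one option; any stable identifier for a leaf would work equally well.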
Thank you for the information! I have indeed been learning Python recently :)
Another question: where can I change the length of the ngrams? I
remember your publication mentioned 5, but I only see one action per
feature in the different test cases I have.
On Mon, Oct 28, 2019 at 11:06 PM Xinyi Zhang wrote:
It's actually fairly easy to get the userid for each cluster. If you look
at visulization.py, you can see a function called allUser, which
takes in a tree/sub-tree and returns all the users in it. For ways to
traverse the tree structure, you can look at lines 59-79 in visulization.py.
As to the ngrams, you can look at lines 90-91 in visulization.py.
You might need a bit of knowledge of Python to modify the code to suit
your own needs, but it should be fairly straightforward.
|
Yes, you can use any ngram length you want. Building the ngrams is a preprocessing step that is not included in this code, so you will need to write that preprocessing code yourself. |
Hmm, so it seems the ngrams are part of the predefined input structure? Then I do want to clarify something about the methodology in the publication. My understanding was that the whole feature space is the union of all possible ngrams, and each value is the number of times that ngram appears in a user's whole path. For example, for the path ABCDEFG with N=3, I should look at features = {ABC, BCD, CDE, DEF, EFG}, right? |
Yes, the input format should be ABC()BCD()CDE(). This is because this
GitHub repo is intended to be more general-purpose than what is described
in the paper.
Hope this answers your question!
On Tue, Oct 29, 2019, 10:58 AM ywu-stats wrote:
Hmm, so it seems the ngrams are part of the predefined input structure?
Then I do want to clarify something about the methodology in the
publication. My understanding was that the whole feature space is the union
of all possible ngrams, and each value is the number of times that ngram
appears in a user's whole path.
For example, for the path ABCDEFG with N=3, I should look
at {ABC, BCD, CDE, DEF, EFG}, right?
So you are saying that building the ngrams is part of the data processing
step, and ABC etc. are predefined in the input data. I'm confused about how
I should format my input. Is it ABC()BCD()CDE()...?
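Since the ngram step is left to the user, here is a minimal preprocessing sketch for the ABCDEFG example with N=3. It slides a window of length N over the path one action at a time and counts each ngram per user. The function names are made up for illustration, and the feature(count) output syntax is an assumption inferred from the ABC()BCD()CDE() and A(5)B(7) examples in this thread, so check it against the sample data before relying on it.

```python
from collections import Counter


def ngram_counts(path, n):
    """Count every contiguous n-gram in a sequence of actions (window step 1)."""
    return Counter(tuple(path[i:i + n]) for i in range(len(path) - n + 1))


def format_user_line(counts):
    """Render counts as feature(count) pairs, e.g. ABC(1)BCD(1).

    Assumption: this syntax is inferred from the thread, not from the code.
    """
    return "".join(f"{''.join(gram)}({count})" for gram, count in counts.items())


path = list("ABCDEFG")  # one user's path, one action per element
counts = ngram_counts(path, 3)
print(format_user_line(counts))  # ABC(1)BCD(1)CDE(1)DEF(1)EFG(1)
```

A path shorter than N simply yields no features, because the range of window start positions is empty.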
|
I see, thanks! |
Hi @xychang Ex: A(5)B(7) |
Hi @Enforcer007, when we say 5-gram, it actually includes the timegap. |
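Hi @xychang
Thanks for responding. I have 2 questions:
Q1:
Consider that we go for 3-grams and the click stream is:
Sequence = S1g1S2g2S1g1S3g1S4g2S2g3S4g1S1
Then what would T3(Sequence) be?
T3(Sequence) = {(S1,g1,S2),(g1,S2,g2),(S2,g2,S1),......}
OR
T3(Sequence) = {(S1,g1,S2),(S2,g2,S1),(S1,g1,S3),......}
Q2:
You say it's a 5-gram, but I see a 3-gram pattern in the visualisation. Can you please explain?
Thanks
|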
So, in our implementation, we actually included both 3-grams and 5-grams.
We found this to be helpful in practice.
On Tue, Nov 19, 2019, 11:22 PM Akhil a.k.a Enforcer007 wrote:
Hi @xychang <https://github.com/xychang>
Thanks for responding. I have 2 questions:
*Q1:*
Consider that we go for 3-grams and the click stream is:
Sequence = S1g1S2g2S1g1S3g1S4g2S2g3S4g1S1
Then what would T3(Sequence) be?
T3(Sequence) = {(S1,g1,S2),(g1,S2,g2),(S2,g2,S1),......}
OR
T3(Sequence) = {(S1,g1,S2),(S2,g2,S1),(S1,g1,S3),......}
*Q2*:
You say it's a 5-gram, but I see a 3-gram pattern in the
visualisation. Can you please explain?
[image: doubt]
<https://user-images.githubusercontent.com/6951100/69217376-8e60df00-0b94-11ea-8db9-85246448de06.png>
Thanks
|
OK, that's great. Can you please confirm Q1? Thanks |
For Q1, the answer is the latter: each 3-gram starts on a state, so the window moves forward by one state and one time gap at a time.
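As a rough sketch of that second option, the snippet below tokenizes the example click stream S1g1S2g2S1g1S3g1S4g2S2g3S4g1S1 and extracts windows that always start on a state. It assumes states are written as S followed by digits and time gaps as g followed by digits, strictly alternating and beginning with a state. The function names are hypothetical and this is not the repository's own preprocessing code.

```python
import re


def tokenize(sequence):
    """Split 'S1g1S2...' into ['S1', 'g1', 'S2', ...]."""
    return re.findall(r"[Sg]\d+", sequence)


def state_anchored_ngrams(tokens, n):
    """Return n-token windows that start on a state.

    Stepping by 2 skips the windows that would begin on a time gap.
    Assumes tokens alternate state, gap, state, ... starting with a state.
    """
    return [tuple(tokens[i:i + n]) for i in range(0, len(tokens) - n + 1, 2)]


tokens = tokenize("S1g1S2g2S1g1S3g1S4g2S2g3S4g1S1")
three = state_anchored_ngrams(tokens, 3)
five = state_anchored_ngrams(tokens, 5)
print(three[:3])  # [('S1', 'g1', 'S2'), ('S2', 'g2', 'S1'), ('S1', 'g1', 'S3')]
print(five[0])    # ('S1', 'g1', 'S2', 'g2', 'S1')
```

With this counting, a 3-gram covers two states and the gap between them, while a 5-gram covers three states and two gaps, which is consistent with the remark that the 5-gram includes the timegap.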
|
Hi,
I was finally able to run it through with the sample data.
![test](https://user-images.githubusercontent.com/56888960/67532216-54c2b400-f67a-11e9-9525-4df95579f4a2.png)
I just wanted to verify that my results look as expected (attached visualization). Very cool visualization though!