Description
I have texts from several authors. Each author has their own signature or link somewhere in their texts.
For example, author1:
text1:
sdsadsad daSDA DDASd asd aSD Sd dA SD ASD sadasdasds sadasd
@jhsad.sadas.com sdsdADSA sada
text2:
KDJKLFFD GFDGFDHGF GFHGFDHGFH GFHFGH Lklfgfd gdfsgfdsg df gfdhgf g
hfghghjh jhg @jhsad.sadas.com sfgff fsdfdsf
text3:
jhjkfsdg fdgdf sfds hgfj j kkjjfghgkjf hdkjtkj lfdjfg hkgfl
@jhsad.sadas.com dsfjdshflkds kg lsfdkg;fdgl
How can I find @jhsad.sadas.com in the text?
EDIT:
@jhsad.sadas.com is only an example signature. I don't know what the authors' real signatures are, and they don't follow a fixed format: a signature could be @jhsad.sadas.com, or "visit my blog in fsfsd.sfsf.dfssd", or something else.
All I have is a set of texts from each author, and I know that each author's texts contain a signature unique to that author.
IDEA:
I think that by converting words to vectors and computing the cosine similarity between the texts, we could find the signatures. I think the solution must be something like this.
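As a point of comparison, here is a minimal sketch of a simpler baseline than the cosine-similarity idea: it does not use word vectors at all, and only looks for the longest run of whitespace-separated tokens that occurs in every text of one author. The names `candidate_signatures`, `ngrams`, `other_texts`, and `max_n` are hypothetical and chosen for illustration. It assumes the signature appears verbatim in each text and is at most `max_n` tokens long.

```python
def ngrams(tokens, n):
    """Return the set of all contiguous n-token runs in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def candidate_signatures(texts, other_texts=(), max_n=6):
    """Return the longest token runs shared by every text in `texts`.

    Runs that also occur in `other_texts` (e.g. texts by other authors)
    are discarded, since they are unlikely to be a unique signature.
    """
    if not texts:
        return []
    token_lists = [t.split() for t in texts]
    other_lists = [t.split() for t in other_texts]
    # Try the longest runs first so multi-word signatures win over single words.
    for n in range(max_n, 0, -1):
        common = set.intersection(*(ngrams(toks, n) for toks in token_lists))
        for toks in other_lists:
            common -= ngrams(toks, n)
        if common:
            return sorted(" ".join(run) for run in common)
    return []


# The three example texts from the question.
author1_texts = [
    "sdsadsad daSDA DDASd asd aSD Sd dA SD ASD sadasdasds sadasd "
    "@jhsad.sadas.com sdsdADSA sada",
    "KDJKLFFD GFDGFDHGF GFHGFDHGFH GFHFGH Lklfgfd gdfsgfdsg df gfdhgf g "
    "hfghghjh jhg @jhsad.sadas.com sfgff fsdfdsf",
    "jhjkfsdg fdgdf sfds hgfj j kkjjfghgkjf hdkjtkj lfdjfg hkgfl "
    "@jhsad.sadas.com dsfjdshflkds kg lsfdkg;fdgl",
]

print(candidate_signatures(author1_texts))
```

On real prose, common words and phrases will also appear in every text, so passing texts from other authors as `other_texts` is what narrows the candidates down. A signature that varies slightly between texts would need fuzzy matching, which this sketch does not attempt.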