added documentation #21

Aashu-Adhikari · 2022-03-24T03:43:24Z

Added inline comments and docstrings to explain what the code does.

bhattbhuwan13

Please make the suggested changes.

bhattbhuwan13 · 2022-03-24T03:46:43Z

rank_bm25.py

+        self.doc_freqs = []  # list of dictionaries of term_frequency of each document
+        self.idf = {}  # idf score of each word in whole corpus
+        self.doc_len = []  # list of length of each document in corpus
+        self.tokenizer = tokenizer  # user input tokenizer, defaults to none

        if tokenizer:
            corpus = self._tokenize_corpus(corpus)

        nd = self._initialize(corpus)


What is nd here? Please explain it in a comment.

bhattbhuwan13 · 2022-03-24T03:48:13Z

rank_bm25.py

+        Example:
+            corpus = [['ram', 'is', 'a', 'good', 'boy'], ['ram', 'does', 'cycling', 'and', 'racing'], ['ram', 'is', 'healthy'], ['rita', 'likes', 'shyam'], ['good', 'luck']]
+            nd = {'ram': 3, 'is': 2, 'a': 1, 'good': 2, 'boy': 1, 'does': 1, 'cycling': 1, 'and': 1, 'racing': 1, 'healthy': 1, 'rita': 1, 'likes': 1, 'shyam': 1, 'luck': 1}


Shorten the examples so that I don't need to scroll. The functionality can be explained using only two items in the list.
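
For illustration, a shortened two-document version of the example might look like the sketch below. It assumes, based on the values in the docstring above, that nd maps each word to the number of documents containing it. The helper name document_frequencies is hypothetical and not part of rank_bm25.py.

```python
def document_frequencies(corpus):
    """Map each word to the number of documents it appears in (hypothetical helper)."""
    nd = {}
    for document in corpus:
        # set() ensures a word is counted at most once per document
        for word in set(document):
            nd[word] = nd.get(word, 0) + 1
    return nd


# Two short documents are enough to show the idea
corpus = [["ram", "is", "good"], ["ram", "likes", "cycling"]]
nd = document_frequencies(corpus)
# 'ram' occurs in both documents; every other word occurs in one
```

With only two documents, the whole example fits on a couple of lines and still shows a word shared across documents.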

bhattbhuwan13 · 2022-03-24T03:50:39Z

rank_bm25.py

        for document in corpus:
            self.doc_len.append(len(document))
-            num_doc += len(document)
+            num_words += len(document)  # total number of words in whole corpus


The purpose of the variable num_words has already been explained.

bhattbhuwan13 · 2022-03-24T03:51:33Z

rank_bm25.py

-            frequencies = {}
+            term_frequencies = (
+                {}
+            )  # term frequency of each word in a document........ changed frequencies to term_frequencies


You don't need a comment saying that you renamed the variable. Git keeps track of that.

bhattbhuwan13 · 2022-03-24T03:54:05Z

rank_bm25.py

+                if word not in term_frequencies:
+                    term_frequencies[word] = 0


This block of code can be removed by using defaultdict instead of the normal dictionary.
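
A minimal sketch of that suggestion, assuming the surrounding loop simply counts words in one document. The function name count_terms is hypothetical; only term_frequencies comes from the diff.

```python
from collections import defaultdict


def count_terms(document):
    """Count how often each word occurs in a single tokenized document."""
    # defaultdict(int) returns 0 for missing keys, so no membership check is needed
    term_frequencies = defaultdict(int)
    for word in document:
        term_frequencies[word] += 1
    return term_frequencies


counts = count_terms(["ram", "is", "ram"])
```

One thing to keep in mind: reading a missing key from a defaultdict inserts it, so later lookups should use .get() or an `in` check if accidental insertion would matter.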

Aashu-Adhikari added 3 commits March 22, 2022 17:57

added comments to code

95ffbca

added inline comments

5d2bb29

added inline comments and generated docstring

9406d0b

bhattbhuwan13 reviewed Mar 24, 2022
