In [1]:
import re

import nltk
from nltk.corpus import stopwords

# Requires the NLTK 'punkt' and 'stopwords' data, e.g. nltk.download('stopwords')

In [2]:
# Sample text: the "Programming Fundamentals" page
sentence = """
Programming Fundamentals

Our modern digital creations are complex: a key job of programmers is to express that complexity as simply as possible.
"The best programs are written so that computing machines can perform them quickly and so that human beings can understand them clearly. A programmer is ideally an essayist who works with traditional aesthetic and literary forms as well as mathematical concepts, to communicate the way that an algorithm works and to convince a reader that the results will be correct." -- Donald Knuth

 Lesson 1: Some programming fundamentals
Humans have been programming computers for seven or so decades now. DevOps is a hot new topic in that practice, but the current popularity of the term does not mean we should ignore the many findings on how to write the best programs possible that came before DevOps should be ignored!

DRY 
This stands for Don't Repeat Yourself! It means that any part of your system that might ever need to change should have a single place where you can make the change. Don't copy blocks of code to wherever you need them in your program: write a function and call it from each of those places. Don't define your data tables in your database, and also in your code: find a way (like the Django models.py file) to define your data one place and use that definition to generate both the database and the code that uses the DB.
No magic constants. 
This is a special case of DRY. It is very tempting, when coding your NYU scheduling app, to write code assuming there are two (major) semesters per year. This will be fine... until NYU adopts a tri-mester system. Instead, define a constant NUM_SEMS = 2. You might get away with writing day_of_week = day mod 7, since that number probably will never change. But you really ought to write hour_of_day = hour mod CLOCK_PERIOD, since both 12 and 24 hour timekeeping methods exist.
Make functions do one job. 
Funcitons that perform a single job are simpler to understand, easier to change or eliminate, and render the overall system more comprehensible. For instance, if the county writes a tax program with a function called calc_taxes, it would be natural to eliminate that function if the job is later passed off to a microservice running on the cloud. But, if the coders also happened to include the code to clear tax liens (county claims against the property for unpaid taxes) in the same function... Oops! No one who ever had a tax lien can sell their property, because the lien never gets cleared.
Keep functions short. 
This is related to the previous principle, but focuses on the size of the one job that should be done. A function named handle_yearly_taxes() is doing one job, but probably way too big a job. It would make more sense to have create_tax_roll(), calculate_taxes(), send_bills(), record_payments(), and perhaps more.
Format and indent properly. 
Different languages have different conventions for how to name variables (camelCase, with_underscores, MixedCase, and so on), how to space operators, where to put braces, and so on. You should follow those conventions, unless there is a strong reason not to. Consistent indentation is especially important: it allows a reader of your code to easily line up blocks of control. Irregular indentation is a significant source of bugs, as people modifying the code will make mistakes, for example, about which else goes with which if.
Comment judiciously. 
Code should contain some comments, especially things like docstrings for classes that can be extracted to produce a guide to the system, and comments explaining what particularly tricky or unusual bits of code do. But commenting is no substitute for writing clear, readable code in the first place! The best explanation of what your code does is, if you write it correctly, your code itself. Remember that we could, and once did, write code just as a sequence of 1s and 0s. And all higher-level languages need to be translated into such code in the end. So why bother with C, Java, or Python? These languages exist for humans, not for computers: they make it easier for us to understand and reason about what a program will do. The upshot: you should look at your code as being every bit as much about communicating to humans as about directing a computer.
Go for the golden mean in naming. 
Sometimes, names of functions and variables can be way too cryptic: there are examples in the widely used CLRS Algorithm book where I have found as many as six single-letter variable names used at once. On the other hand, naming a function something like take_input_of_employee_w2_and_calculate_employee_tax_rate() is absurdly long: please remember, other programmers will have to type your function names in order to call your functions! Such immense names also make it extremely difficult to stay within guidelines like PEP 8's dictum of "no lines longer than 79 characters." A more reasonable middle ground might be something like calc_tax_rate(), where an employee's W2 might be a parameter for the function.
Test, test, test! 
Write an automated test to go with every program or new feature you write. Test as completely by hand as you can: don't just test that your code fetches the data from the DB correctly: test that it still works properly if there is no data in the DB, or, indeed, if there is no DB! ("Properly" here could mean "Display an informative error message instead of crashing.")
 Lesson 2: Python coding standards
For this lesson, please read the Python coding standard, PEP 8. It is a very good example of what a coding standard is like, and most of the guidelines can be applied in other languages. Our JavaScript team is choosing a standard at present, and soon we will link to that here as well.

 Other Readings
Programming Best Practices
Following Coding Standards using Flake8

"""



In [20]:
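sentence_lst = nltk.sent_tokenize(sentence)
# Strip newlines from each sentence. Deleting '\n' outright fuses a heading
# with the text that follows it; substitute ' ' instead to keep them apart.
sentence_lst = [re.sub('\n', '', s) for s in sentence_lst]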

print(sentence_lst)

['Programming FundamentalsOur modern digital creations are complex: a key job of programmers is to express that complexity as simply as possible.', '"The best programs are written so that computing machines can perform them quickly and so that human beings can understand them clearly.', 'A programmer is ideally an essayist who works with traditional aesthetic and literary forms as well as mathematical concepts, to communicate the way that an algorithm works and to convince a reader that the results will be correct."', '-- Donald Knuth Lesson 1: Some programming fundamentalsHumans have been programming computers for seven or so decades now.', 'DevOps is a hot new topic in that practice, but the current popularity of the term does not mean we should ignore the many findings on how to write the best programs possible that came before DevOps should be ignored!', "DRY This stands for Don't Repeat Yourself!", 'It means that any part of your system that might ever need to change should have a
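The output above shows headings fused with the following line ('Programming FundamentalsOur'), because each newline is deleted rather than replaced. A minimal sketch of one alternative, using only the standard library: collapse every run of whitespace into a single space. The helper name `normalize_whitespace` is hypothetical and not part of the notebook.

```python
import re


def normalize_whitespace(text):
    """Collapse each run of whitespace (newlines included) into one space."""
    return re.sub(r"\s+", " ", text).strip()


sample = "Programming Fundamentals\n\nOur modern digital creations are complex."
print(normalize_whitespace(sample))
# prints: Programming Fundamentals Our modern digital creations are complex.
```

Applied to each element of `sentence_lst`, this would keep the heading and the body text as separate words while still producing single-line sentences.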

In [8]:
def count_words(texts):
    """Return the distinct non-stopword tokens of `texts`, most frequent first."""
    stop_words = set(stopwords.words('english'))
    words = nltk.word_tokenize(texts)

    # Remove single-character tokens (mostly punctuation); multi-character
    # punctuation tokens such as `` or -- still pass this filter
    words = [word for word in words if len(word) > 1]

    # Remove numbers
    words = [word for word in words if not word.isdigit()]

    # Lowercase all words (the NLTK stopwords are lowercase too)
    words = [word.lower() for word in words]

    words = [word for word in words if word not in stop_words]
    fdist = nltk.FreqDist(words)

    # Return every distinct word, ordered from most to least frequent
    return [word for word, _ in fdist.most_common()]
    


In [13]:
data = count_words(sentence)
for i in data:
    print(i)


code
write
function
job
test
make
like
coding
one
program
change
names
languages
``
data
might
programming
system
db
best
n't
way
''
functions
tax
standard
python
mean
place
need
humans
define
understand
works
lesson
properly
also
go
every
standards
guidelines
new
ever
never
dry
indentation
county
would
devops
call
hour
example
something
nyu
instead
blocks
remember
writing
easier
lien
conventions
since
reader
reason
could
computers
variables
programs
programmers
especially
clear
please
correctly
probably
many
property
different
perform
used
hand
database
well
...
possible
mod
pep
comments
eliminate
fundamentals
single
exist
naming
's
algorithm
results
per
microservice
hour_of_day
literary
send_bills
include
applied
magic
day_of_week
--
team
findings
consistent
follow
tri-mester
machines
stands
current
goes
ground
focuses
strong
allows
completely
calc_tax_rate
golden
named
app
modifying
cloud
use
readable
type
operators
concepts
following
control
topic
sense
scheduling
end
get
feature
h
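The ranked list above still contains tokens that are not words, such as ``, '', -- and ..., because the length filter only removes single characters. One possible refinement, sketched here with `collections.Counter` instead of `nltk.FreqDist` so that it runs without NLTK data, is to keep only tokens that contain at least one letter. The function name `rank_tokens` and the sample token list are assumptions made for illustration.

```python
from collections import Counter


def rank_tokens(tokens, stop_words=frozenset()):
    """Rank tokens by frequency, keeping only tokens that contain a letter."""
    cleaned = [
        tok.lower()
        for tok in tokens
        if any(ch.isalpha() for ch in tok)  # drops ``, '', --, ... and digits
    ]
    cleaned = [tok for tok in cleaned if tok not in stop_words]
    return [word for word, _ in Counter(cleaned).most_common()]


# Hypothetical tokens resembling word_tokenize output
tokens = ["Code", "``", "code", "--", "n't", "test", "...", "42"]
print(rank_tokens(tokens, stop_words={"n't"}))
# prints: ['code', 'test']
```

Contraction fragments such as n't and 's contain letters, so they pass the letter check; in this sketch they have to be listed in `stop_words` to be removed.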

In [10]:
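def get_all_sentence_contain_word(word, sentence_lst):
    """Build one record for every sentence that contains `word`."""
    entries = []
    for sent in sentence_lst:
        # Plain substring test: case-sensitive, and it also matches inside
        # longer words (e.g. 'one' in 'done')
        if word in sent:
            entries.append({
                "title": word,
                "info": {
                    "page": "Programming Fundamentals",
                    "url": "http://127.0.0.1:8000/devops/basics",
                    "sentence": sent,
                },
            })
    return entries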
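
The `word in i` test is a case-sensitive substring check. Since `count_words` lowercases its output, a word like 'db' will not match a sentence that only says 'DB', while 'one' will match a sentence containing 'done'. A minimal sketch of a whole-word, case-insensitive check that could stand in for it, assuming that behaviour is what is wanted; `contains_word` is a hypothetical helper, not part of the notebook.

```python
import re


def contains_word(word, sentence):
    """True if `word` occurs in `sentence` as a whole word, ignoring case."""
    # Lookarounds instead of \b so words with punctuation are handled too
    pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None


print(contains_word("db", "the code that uses the DB."))  # prints: True
print(contains_word("one", "the one job that should be done"))  # prints: True
print(contains_word("one", "the job is done"))  # prints: False
```

Swapping this in would change which sentences are collected, so the printed results in the following cell would differ.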

In [23]:
lst = []
for word in data:
    # Collect the matching-sentence records for every ranked word
    lst.extend(get_all_sentence_contain_word(word, sentence_lst))
print(lst)


[{'info': {'url': 'http://127.0.0.1:8000/devops/basics', 'page': 'Programming Fundamentals', 'sentence': "Don't copy blocks of code to wherever you need them in your program: write a function and call it from each of those places."}, 'title': 'code'}, {'info': {'url': 'http://127.0.0.1:8000/devops/basics', 'page': 'Programming Fundamentals', 'sentence': "Don't define your data tables in your database, and also in your code: find a way (like the Django models.py file) to define your data one place and use that definition to generate both the database and the code that uses the DB."}, 'title': 'code'}, {'info': {'url': 'http://127.0.0.1:8000/devops/basics', 'page': 'Programming Fundamentals', 'sentence': 'It is very tempting, when coding your NYU scheduling app, to write code assuming there are two (major) semesters per year.'}, 'title': 'code'}, {'info': {'url': 'http://127.0.0.1:8000/devops/basics', 'page': 'Programming Fundamentals', 'sentence': 'But, if the coders also happened to in