Commit 70459cc

POS removal from hindi text
1 parent e41b2c3 commit 70459cc

6 files changed: +925 -0 lines changed

Remove_POS_hindi_text/Input.png

108 KB

Remove_POS_hindi_text/Only_Hindi.txt

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

Remove_POS_hindi_text/Output.png

181 KB

Remove_POS_hindi_text/README.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
# Package/Script Name

Short description of package/script

--> Package installed: NLTK
- NLTK stands for 'Natural Language Toolkit'. It provides common natural language processing algorithms such as tokenization, part-of-speech tagging, stemming, sentiment analysis, topic segmentation, and named entity recognition. NLTK helps the computer analyze, preprocess, and understand written text.

## Setup instructions

--> Explanation on how to setup and run your package/script locally
- Import the NLTK package by writing 'import nltk' on the first line of your script.
- To run the script locally, save the 'Tagged_Hindi_Corpus.txt' file at a location of your choice.
- In the code, in fp=open(r"..."), give the location of the file you saved in the previous step.
- In the code, in fd=open(r"..."), give the location where you want the file containing only Hindi text (after POS removal) to be written.
- Run the script with "python hindi_POS_tag_removal.py".
- The output file will then contain only the Hindi text.
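
The two path steps above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper names `read_text_utf8` and `write_text_utf8` and the two path constants are hypothetical, and the explicit `encoding="utf-8"` is an assumption that the corpus is stored as UTF-8. Passing the encoding explicitly can matter for Devanagari text, because the default encoding used by `open()` may not be UTF-8 on some systems, Windows in particular.

```python
from pathlib import Path

# Hypothetical locations: point these at your own copies of the files.
INPUT_PATH = Path("Tagged_Hindi_Corpus.txt")   # the tagged corpus you saved
OUTPUT_PATH = Path("Only_Hindi.txt")           # where the Hindi-only text goes


def read_text_utf8(path):
    """Read a whole text file, decoding it as UTF-8 (assumed encoding)."""
    with open(path, "r", encoding="utf-8") as fp:
        return fp.read()


def write_text_utf8(path, text):
    """Write text to a file as UTF-8, replacing any existing content."""
    with open(path, "w", encoding="utf-8") as fd:
        fd.write(text)
```

Using `with` blocks closes both files automatically, even if reading or writing raises an error.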

## Detailed explanation of script, if needed

The script works as follows:

- Open the hindi_tagged_corpus file.
- Tokenize the data.
- Create two empty lists.
- Collect all the POS categories.
- Collect all the Hindi words.
- Concatenate the words.
- Write the words to the only_hindi file.
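
The steps above could look roughly like the sketch below. It is not the code from hindi_POS_tag_removal.py: it uses only the standard library rather than NLTK, the function name `strip_pos_tags` is hypothetical, and it assumes each token has the form `word_TAG` separated by whitespace. The actual separator in 'Tagged_Hindi_Corpus.txt' is not shown here, so it is left as a parameter.

```python
def strip_pos_tags(tagged_text, sep="_"):
    """Split POS-tagged text into plain text and a list of tags.

    Assumes whitespace-separated tokens of the form word<sep>TAG.
    Tokens without the separator are kept unchanged as words.
    """
    words = []  # collects the Hindi words
    tags = []   # collects the POS categories

    for token in tagged_text.split():           # simple whitespace tokenization
        word, found, tag = token.rpartition(sep)
        if not found:
            # No separator in this token: treat the whole token as a word.
            word, tag = token, ""
        if word:
            words.append(word)
        if tag:
            tags.append(tag)

    # Concatenate the words back into running text.
    return " ".join(words), tags


if __name__ == "__main__":
    sample = "राम_NNP घर_NN जाता_VM है_VAUX"  # made-up sample, not from the corpus
    text, tags = strip_pos_tags(sample)
    print(text)
    print(tags)
```

Splitting on the last separator with `rpartition` keeps any earlier occurrences of the separator inside the word itself. If the corpus uses a different convention such as `word/TAG`, pass `sep="/"`.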

## Input

![Input](Input.png)

## Output

![Output](Output.png)

## Author(s)

- This code was written by [Sanya Devansh Zaveri](https://github.com/zaverisanya).

## Disclaimers, if any

There are no disclaimers for this script.
