rtmigo/commie_py

Python package for extracting comments from source code.

Multiple programming and markup languages are supported; see the table of methods under "Find comments in a string".

Install

$ pip3 install commie

Find comments in a file

from pathlib import Path
import commie

for comment in commie.iter_comments(Path("/path/to/source.cpp")):

    # something like "/* sample */"
    print("Comment code:", comment.code)
    print("Comment code location:", comment.code_span.start, comment.code_span.end)
    
    # something like " sample " 
    print("Comment inner text:", comment.text)
    print("Comment text location:", comment.text_span.start, comment.text_span.end)
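
Because each comment carries the span of its full code, those spans can be used to cut comments out of the source. The following is a minimal sketch, not part of commie: `remove_spans` is a hypothetical helper, and it assumes the offsets are zero-based character positions with an exclusive end, as in ordinary Python slicing.

```python
from typing import Iterable, Tuple


def remove_spans(source: str, spans: Iterable[Tuple[int, int]]) -> str:
    """Return `source` with every (start, end) character range cut out.

    Assumption: offsets are zero-based and `end` is exclusive, as in
    regular Python slicing. This helper is hypothetical, not commie API.
    """
    pieces = []
    position = 0
    for start, end in sorted(spans):
        # Keep the text between the previous span and this one.
        pieces.append(source[position:start])
        position = max(position, end)
    # Keep whatever follows the last span.
    pieces.append(source[position:])
    return "".join(pieces)


source = "int x = 1; /* sample */ int y = 2;"
start = source.index("/*")
end = source.index("*/") + 2
print(remove_spans(source, [(start, end)]))
```

With commie, the pairs would presumably be built as `(comment.code_span.start, comment.code_span.end)` for each comment returned by `commie.iter_comments`.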

Find comments in a string

| Method | Works for |
|---|---|
| commie.iter_comments_c | C, C++, C#, Java, Objective-C, JavaScript, Dart, TypeScript |
| commie.iter_comments_go | Go |
| commie.iter_comments_ruby | Ruby |
| commie.iter_comments_python | Python |
| commie.iter_comments_shell | Bash, Sh |
| commie.iter_comments_html | HTML, XML, SGML |
| commie.iter_comments_css | CSS |
| commie.iter_comments_sass | SASS |

import commie

source_code_in_golang: str = ...

for comment in commie.iter_comments_go(source_code_in_golang):
    # ... process comment ...
    pass

Find comments in a string with a known filename

The commie.iter_comments function tries to guess the file format from the provided filename.

from pathlib import Path
import commie

filename: str = "/path/to/mycode.go"
source_code: str = Path(filename).read_text()

for comment in commie.iter_comments(source_code, filename=filename):
    # ... process comment ...
    pass
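
To illustrate what guessing the format from a filename can look like, here is a rough sketch that maps a file extension to one of the method names from the table in "Find comments in a string". The mapping and the `guess_parser_name` helper are hypothetical and cover only a few extensions; the actual lookup inside commie may work differently.

```python
from pathlib import Path
from typing import Optional

# Hypothetical, illustrative subset: extension -> method name from the table.
# This is NOT commie's real lookup table.
PARSER_BY_EXTENSION = {
    ".c": "iter_comments_c",
    ".cpp": "iter_comments_c",
    ".java": "iter_comments_c",
    ".js": "iter_comments_c",
    ".go": "iter_comments_go",
    ".rb": "iter_comments_ruby",
    ".py": "iter_comments_python",
    ".sh": "iter_comments_shell",
    ".html": "iter_comments_html",
    ".css": "iter_comments_css",
}


def guess_parser_name(filename: str) -> Optional[str]:
    """Return the method name for `filename`, or None if unknown."""
    suffix = Path(filename).suffix.lower()
    return PARSER_BY_EXTENSION.get(suffix)


print(guess_parser_name("/path/to/mycode.go"))
```

A caller could then fetch the function with `getattr(commie, name)` and pass it the source string, handling the `None` case for unrecognized extensions.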

Group single line comments

When single-line comments are adjacent, it makes sense to consider them together:

// Group A: A short comment

// Group B: It consists of three
// single-line comments with 
// no empty lines between them

// Group C: This paragraph loosely 
// stretched into two lines  

The comments from the example above can be combined into three groups as follows:

from commie import iter_comments, group_singleline_comments

for group in group_singleline_comments(iter_comments(...)):
    # ... each group is a list of Comment objects ...
    pass

Multi-line comments will also be returned. They will not be grouped with their neighbors.
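
The grouping rule itself can be sketched without commie. The example below is a simplified illustration, not the library's implementation: it assumes each single-line comment is given as a hypothetical `(line_number, text)` pair, and it treats comments on consecutive lines as one group. It ignores the case where code shares a line with a comment.

```python
from typing import List, Tuple


def group_adjacent_lines(comments: List[Tuple[int, str]]) -> List[List[str]]:
    """Group comment texts whose line numbers are consecutive.

    Assumption: `comments` is sorted by line number and holds only
    single-line comments. This is a sketch, not commie's algorithm.
    """
    groups: List[List[str]] = []
    previous_line = None
    for line_number, text in comments:
        if previous_line is not None and line_number == previous_line + 1:
            # Directly follows the previous comment: same group.
            groups[-1].append(text)
        else:
            # A gap (or the first comment) starts a new group.
            groups.append([text])
        previous_line = line_number
    return groups


comments = [
    (1, "Group A: A short comment"),
    (3, "Group B: It consists of three"),
    (4, "single-line comments with"),
    (5, "no empty lines between them"),
    (7, "Group C: This paragraph loosely"),
    (8, "stretched into two lines"),
]
groups = group_adjacent_lines(comments)
print([len(group) for group in groups])
```

Applied to the three groups from the example in this section, the sketch yields one group per paragraph of comments.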

History

This project was forked from comment_parser in 2021. Motivation:

| comment_parser | commie |
|---|---|
| Returns only a line number | Returns the positions where the comment starts and ends, just like a regular string search |
| Returns only the text of a comment | Respects markup as well, making it possible to remove or replace the entire comment |
| Depends on python-magic, which requires an optional installation of binaries | Pure Python. Easy to install with pip |