### Reading a JSON file that contains comments and converting it to a JSON object

Standard JSON does not support comments.  
So if the contents of a JSON file include comments, json.loads fails to parse them and raises an error.

In [1]:
# Assume this is the file content
file_content = """
  \\\\ test_Remove_Comments_in_JSON.json
{ \\\\ start !
    "a": 0, \\\\ "first" comment
    "b": "1",
    "c": " \\\\", \\\\ "third" comment
    "d": " http:\\\\",
    "e": "\\\\" \\\\ "ws:\\\\xx"
}
"""  
print("file content with comments:\n%s" % file_content)

file content with comments:

  \\ test_Remove_Comments_in_JSON.json
{ \\ start !
    "a": 0, \\ "first" comment
    "b": "1",
    "c": " \\", \\ "third" comment
    "d": " http:\\",
    "e": "\\" \\ "ws:\\xx"
}



In [2]:
import json
import traceback
try:
    json_obj = json.loads(file_content)
except json.JSONDecodeError:  # the comments make the text invalid JSON
    print(traceback.format_exc())

Traceback (most recent call last):
  File "<ipython-input-2-60ef3c7a449f>", line 4, in <module>
    json_obj = json.loads(file_content)
  File "d:\programs\python\python35\lib\json\__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "d:\programs\python\python35\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "d:\programs\python\python35\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 3 (char 3)



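By inspection, the following can be observed:  
For each line, when a double backslash is used as the comment marker, a double backslash being preceded by an even number of double quotes is a necessary and sufficient condition for the text after it being a comment.  
In other words:  
For each line, reading from left to right: the first double backslash preceded by an even number of double quotes ↔ the text from that double backslash onward is a comment.  
This holds as long as no string value contains an escaped double quote (a backslash followed by a double quote).  

The function below applies this rule to find the comment part of a line of text.  
Once the comment part of a line is known, it can be replaced with an empty string, which removes the comment.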
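To make the rule concrete, the sketch below takes one line from the sample content and reports how many double quotes precede each double backslash. The names `MARKER`, `sample_line` and `quotes_before_each_marker` are illustrative and not part of the original notebook. Only the first marker with an even count matters, because everything after it is already comment text.

```python
MARKER = "\\\\"  # two literal backslash characters, used here as the comment marker

# One line from the sample content: a string value ending in a double backslash,
# followed by a comment.
sample_line = '    "c": " ' + MARKER + '", ' + MARKER + ' "third" comment'


def quotes_before_each_marker(line, marker=MARKER):
    """Return (index, quote_count) for every non-overlapping marker in line."""
    results = []
    position = line.find(marker)
    while position != -1:
        results.append((position, line[:position].count('"')))
        position = line.find(marker, position + len(marker))
    return results


for position, quote_count in quotes_before_each_marker(sample_line):
    if quote_count % 2 == 0:
        verdict = "even, so the comment starts here"
    else:
        verdict = "odd, so this marker is inside a string"
    print("marker at index %d: %d quotes before it (%s)" % (position, quote_count, verdict))
```

The first marker has three quotes before it and belongs to the string value; the second has four and starts the comment.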

In [3]:
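def find_line_comment(line):
    """
    Find the comment within a line.
    :param line: one line of text that uses a double backslash as the comment marker
    :return: the comment, including its leading double backslash, or "" if there is none
    """
    marker = "\\\\"  # two literal backslashes
    split_line = line.split(marker)
    quotation_count = 0

    # str.split always returns at least one element, so no emptiness check is needed.
    for index, element in enumerate(split_line):
        quotation_count += element.count('"')
        # An even count means we are outside any string, so the next marker starts the comment.
        if quotation_count % 2 == 0:
            if index < len(split_line) - 1:
                return marker + marker.join(split_line[index + 1:])
            return ""
    return ""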

Test:

In [4]:
file_content_lines = file_content.split("\n")  # in real use, file_content would come from: with open(file_path, "r") as f: f.readlines()
new_file_content_lines = []
for line in file_content_lines:
    line_comment = find_line_comment(line)
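    if line_comment:
        print("line comment found: %s" % line_comment)
    # The comment is always the tail of the line, so slice it off;
    # str.replace would also remove an identical substring earlier in the line.
    new_line = line[:len(line) - len(line_comment)]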
    new_file_content_lines.append(new_line)
new_file_content = "\n".join(new_file_content_lines)
print("new_file_content: %s" % new_file_content)

line comment found: \\ test_Remove_Comments_in_JSON.json
line comment found: \\ start !
line comment found: \\ "first" comment
line comment found: \\ "third" comment
line comment found: \\ "ws:\\xx"
new_file_content: 
  
{ 
    "a": 0, 
    "b": "1",
    "c": " \\", 
    "d": " http:\\",
    "e": "\\" 
}
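The quote-counting rule assumes that string values never contain an escaped double quote. If they might, one possible alternative is to scan the line character by character and track whether the scanner is inside a string. The following is a minimal sketch of that idea, not part of the original notebook; `strip_line_comment` and `tricky_line` are hypothetical names.

```python
def strip_line_comment(line, marker="\\\\"):
    """Return line without its trailing comment, honoring backslash escapes in strings."""
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            if ch == "\\":
                i += 2  # skip the escaped character (for example \" or \\)
                continue
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif line.startswith(marker, i):
            return line[:i]  # marker found outside any string: the comment starts here
        i += 1
    return line


# A single escaped quote makes the total quote count odd, which defeats plain
# quote counting, but the scanner still finds the comment.
tricky_line = r'"q": "5\" tall" \\ note'
print(repr(strip_line_comment(tricky_line)))

# A line from the sample content gives the same result as find_line_comment.
sample_line = r'    "c": " \\", \\ "third" comment'
print(repr(strip_line_comment(sample_line)))
```

For the sample content in this notebook both approaches remove the same text; the scanner only differs on lines with escaped quotes.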



With the comments removed, convert the text to a JSON object:

In [5]:
json_obj = json.loads(new_file_content)
print("json_obj:\n%s" % json.dumps(json_obj, indent=4))

json_obj:
{
    "d": " http:\\",
    "c": " \\",
    "e": "\\",
    "b": "1",
    "a": 0
}


Success ~
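The steps above can be combined into one helper that reads a file, strips the comments line by line, and parses the result. This is a sketch under assumptions: `load_json_with_comments`, its `encoding` parameter and the temporary file are hypothetical and only serve the demonstration. `find_line_comment` is repeated so that the block runs on its own.

```python
import json
import os
import tempfile

MARKER = "\\\\"  # two literal backslashes, the comment marker used in this notebook


def find_line_comment(line):
    """Same quote-counting rule as above, repeated so this block is self-contained."""
    split_line = line.split(MARKER)
    quotation_count = 0
    for index, element in enumerate(split_line):
        quotation_count += element.count('"')
        if quotation_count % 2 == 0:
            if index < len(split_line) - 1:
                return MARKER + MARKER.join(split_line[index + 1:])
            return ""
    return ""


def load_json_with_comments(file_path, encoding="utf-8"):
    """Hypothetical helper: strip double-backslash comments line by line, then parse."""
    with open(file_path, "r", encoding=encoding) as f:
        lines = f.read().split("\n")
    # Each comment is the tail of its line, so slicing removes exactly that text.
    cleaned = [line[:len(line) - len(find_line_comment(line))] for line in lines]
    return json.loads("\n".join(cleaned))


# Usage: write a small commented file to a temporary location and load it back.
sample = '{ \\\\ start !\n    "a": 0, \\\\ "first" comment\n    "e": "\\\\" \\\\ "ws:\\\\xx"\n}\n'
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False, encoding="utf-8") as tmp:
    tmp.write(sample)
try:
    loaded = load_json_with_comments(tmp.name)
finally:
    os.remove(tmp.name)
print(loaded)
```

Reading the whole file and splitting on "\n" avoids the trailing newlines that readlines() keeps, so joining the cleaned lines does not double the line breaks.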