# DICOM Anonymization: Encryption and Decryption

#### @博动医学影像科技

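Privacy regulations for medical examinations require that patient-identifying information in the DICOM header be anonymized. This document describes how the imaging platform encrypts that information for anonymization and how it is decrypted again.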

The pydicom package is used to read, modify, and rewrite DICOM data; the xx package manages the encryption and decryption process.

In [None]:
import dicom as dcm        # pydicom 0.9.x
# import pydicom as dcm    # pydicom 1.x and later


Anonymization keeps the original copy and creates a new anonymized copy. The anonymized copy differs from the original as follows:

- Added tag: (0002, 0003) MediaStorageSOPInstanceUID, whose value is a unique mapping of the file's storage path
- Modified tag: (0008, 0018) SOPInstanceUID, set equal to the value of (0002, 0003)
- Tags set to empty: (0008, 0020) StudyDate, (0008, 0023) ContentDate, (0008, 0030) StudyTime, (0008, 0033) ContentTime
- Deleted tags: (0008, 0021) SeriesDate, (0008, 0022) AcquisitionDate, (0008, 0031) SeriesTime, (0008, 0032) AcquisitionTime, (0008, 0080) InstitutionName, (0008, 0081) InstitutionAddress, (0008, 1040) InstitutionalDepartmentName, (0008, 1048) PhysiciansOfRecord, (0008, 1050) PerformingPhysicianName, (0008, 1060) NameOfPhysiciansReadingStudy, (0008, 1070) OperatorsName, (0008, 1080) AdmittingDiagnosesDescription
- Modified tag: (0010, 0010) PatientName, changed to 'Anonymized' followed by a sequence number
- Modified tag: (0010, 0020) PatientID, changed to a designated anonymous ID
- Tags set to empty: (0010, 0030) PatientBirthDate, (0010, 0040) PatientSex, (0020, 0010) StudyID
- Deleted tags: (0010, 0032) PatientBirthTime, (0010, 1001) OtherPatientNames, (0010, 1010) PatientAge, (0010, 1020) PatientSize, (0010, 1030) PatientWeight, (0010, 2160) EthnicGroup, (0010, 2180) Occupation, (0010, 21b0) AdditionalPatientHistory, (0010, 4000) PatientComments, (0018, 1000) DeviceSerialNumber, (0018, 1030) ProtocolName
- Added tag: (0012, 0062) PatientIdentityRemoved, with value YES
- Modified tags: (0020, 000d) StudyInstanceUID and (0020, 000e) SeriesInstanceUID, each changed to a designated anonymous UID
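
The rules above can be read as a small rule table. The following is a minimal sketch, not the project's implementation: it uses a plain `dict` keyed by DICOM keyword as a stand-in for a pydicom dataset, lists only a few tags from each rule, and the names `BLANK`, `DELETE`, and `anonymize` as well as the sample values are made up for illustration.

```python
# Hypothetical rule table; only a few tags from each rule are listed.
BLANK = ["StudyDate", "ContentDate", "StudyTime", "ContentTime",
         "PatientBirthDate", "PatientSex", "StudyID"]
DELETE = ["SeriesDate", "AcquisitionDate", "InstitutionName",
          "PatientAge", "PatientWeight", "DeviceSerialNumber"]


def anonymize(record, index, anon_id, uids):
    """Return an anonymized copy of `record`; the original is left untouched.

    `record` is a plain dict keyed by DICOM keyword (a stand-in for a
    dataset), `index` numbers the patient name, `anon_id` is the anonymous
    patient ID, and `uids` maps UID keywords to their anonymous values.
    """
    out = dict(record)                      # keep the original copy intact
    for key in BLANK:
        out[key] = ""                       # element is kept, value is emptied
    for key in DELETE:
        out.pop(key, None)                  # element is removed if present
    out["PatientName"] = "Anonymized%d" % index
    out["PatientID"] = anon_id
    out["PatientIdentityRemoved"] = "YES"
    out.update(uids)                        # SOP/Study/Series instance UIDs
    return out


# Made-up sample values, for illustration only.
original = {"PatientName": "DOE^JOHN", "PatientAge": "060Y",
            "StudyDate": "20180101", "StudyInstanceUID": "9.9.9"}
anonymous = anonymize(original, 1, "1.2.3.4",
                      {"StudyInstanceUID": "1.2.3.4.5"})
print(anonymous["PatientName"], anonymous["StudyInstanceUID"])
```

Unlike `del`, `dict.pop(key, None)` does not fail when a tag is absent, which is convenient when files from different devices carry different optional tags.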


Load the example DICOM file

In [None]:
case = dcm.read_file("examples/LAD_IV_HyperView1.dcm")


Apply the anonymization changes

In [None]:
case.file_meta.MediaStorageSOPInstanceUID = '1.2.3'  # group 0002 elements belong in file_meta
case.SOPInstanceUID = '1.2.3'
case.StudyDate = case.ContentDate = case.StudyTime = case.ContentTime = ''
case.PatientName = 'Anonymized1'
case.PatientID = '1.2.3.4'
case.PatientBirthDate = case.PatientSex = case.StudyID = ''
case.PatientIdentityRemoved = 'YES'
case.StudyInstanceUID = '1.2.3.4.5'
case.SeriesInstanceUID = '1.2.3.4.5.6'


In [None]:
del case.SeriesDate, case.AcquisitionDate, case.SeriesTime, case.AcquisitionTime, \
    case.InstitutionName, case.InstitutionAddress, case.InstitutionalDepartmentName, case.PhysiciansOfRecord, \
    case.PerformingPhysicianName, case.NameOfPhysiciansReadingStudy, case.OperatorsName, case.AdmittingDiagnosesDescription

del case.ImageTransformationMatrix, case.ImageTranslationVector  # malformed in the original file

del case.PatientBirthTime, case.OtherPatientNames, case.PatientAge, case.PatientSize, \
    case.PatientWeight, case.EthnicGroup, case.Occupation, case.AdditionalPatientHistory, \
    case.PatientComments, case.DeviceSerialNumber, case.ProtocolName

case.remove_private_tags()  # remove all private elements


Display the result

In [None]:
# del case.ImageTransformationMatrix, case.ImageTranslationVector  # malformed in the original file
case

Save the result

In [None]:
case.save_as("examples/" + case[0x0008, 0x1140][0][0x0008, 0x1155].value + ".dcm")

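When anonymizing a DICOM file, the UIDs at each level (study, series, and instance) can be hashed with MD5, so that each anonymized value is still an effectively unique code.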

In [None]:
import hashlib

data = '1234'
data_md5 = hashlib.md5(data.encode('utf-8'))
print(data_md5.hexdigest())
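
A hex digest contains the letters a to f, while a DICOM UID may contain only digits and dots and is limited to 64 characters, so the digest cannot be stored in a UID element as-is. One possible workaround, sketched here as an assumption rather than the project's method, is to build an MD5-based UUID with the standard library's `uuid.uuid3` and append its integer value to the `2.25` root, which DICOM permits for UUID-derived UIDs. The function name `md5_to_uid` and the sample UID are hypothetical.

```python
import uuid


def md5_to_uid(original_uid):
    """Map a UID (or any string, such as a file path) to a numeric DICOM UID.

    uuid.uuid3 builds a name-based UUID from an MD5 hash, so the same input
    always yields the same output.
    """
    hashed = uuid.uuid3(uuid.NAMESPACE_OID, original_uid)
    return "2.25." + str(hashed.int)        # digits and dots only


anon_uid = md5_to_uid("1.2.840.113619.2.55.3")   # made-up example UID
print(anon_uid)
```

The result is at most 44 characters long (the `2.25.` prefix plus up to 39 decimal digits), which fits within the 64-character limit.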

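MD5 is a one-way hash and cannot be reversed. To recover the original UID from an anonymized UID, each pair must be stored as a database entry, and the original UID is then retrieved by looking up the anonymized one. See the project code for the implementation.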
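
The lookup described above can be sketched with an in-memory SQLite table. This is a minimal illustration under stated assumptions, not the project's code: the table name `uid_map`, the functions `remember` and `recover`, and the sample UID are all hypothetical.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")          # a real system would use a persistent store
conn.execute("CREATE TABLE uid_map (anon TEXT PRIMARY KEY, original TEXT NOT NULL)")


def remember(original_uid):
    """Hash a UID with MD5, store the pair, and return the anonymized value."""
    anon = hashlib.md5(original_uid.encode("utf-8")).hexdigest()
    # INSERT OR IGNORE keeps repeated calls for the same UID idempotent.
    conn.execute("INSERT OR IGNORE INTO uid_map VALUES (?, ?)", (anon, original_uid))
    conn.commit()
    return anon


def recover(anon_uid):
    """Look up the original UID; returns None if the value is unknown."""
    row = conn.execute("SELECT original FROM uid_map WHERE anon = ?",
                       (anon_uid,)).fetchone()
    return row[0] if row else None


anon = remember("1.2.3.4.5")
print(anon, "->", recover(anon))
```

Because this table is the only way back to the original identifiers, it would need to be protected at least as carefully as the original files.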