I initially modified a JSON example directly and found this issue, but I think showing it from the Tab side is clearer.
I modified the BII-I-1 Tab example so that the first culture uses a different protocol from the rest. This validates and converts to JSON without issues. Converting that JSON back to Tab, however, raises a KeyError caused by the differing protocol.
Code:
import json
from isatools.convert import isatab2json, json2isatab

# Convert the modified ISA-Tab directory to ISA-JSON and save it.
isa_json = isatab2json.convert('C:/Users/Sparda/Desktop/Moseley Lab/Code/MESSES/isadatasets/tab/BII-I-1_conversion_testing', use_new_parser=True)
with open('C:/Users/Sparda/Desktop/Moseley Lab/Code/MESSES/isadatasets/BII-I-1_testing.json', 'w') as out_fp:
    json.dump(isa_json, out_fp, indent=2)

# Convert the saved JSON back to ISA-Tab; this is the call that fails.
with open('C:/Users/Sparda/Desktop/Moseley Lab/Code/MESSES/isadatasets/BII-I-1_testing.json') as file_pointer:
    json2isatab.convert(file_pointer, 'C:/Users/Sparda/Desktop/Moseley Lab/Code/MESSES/isadatasets/BII-I-1_testing/', validate_first=False)
Traceback:
Traceback (most recent call last):
  File "C:\Users\Sparda\AppData\Local\Temp\ipykernel_5600\1208495759.py", line 5, in <cell line: 4>
    json2isatab.convert(file_pointer, 'C:/Users/Sparda/Desktop/Moseley Lab/Code/MESSES/isadatasets/BII-I-1_testing/', validate_first=False)
  File "C:\Python310\lib\site-packages\isatools\convert\json2isatab.py", line 49, in convert
    isatab.dump(isa_obj=isa_obj, output_path=path, i_file_name=i_file_name,
  File "C:\Python310\lib\site-packages\isatools\isatab\dump\core.py", line 170, in dump
    write_study_table_files(investigation, output_path)
  File "C:\Python310\lib\site-packages\isatools\isatab\dump\write.py", line 134, in write_study_table_files
    df_dict[olabel][-1] = node.executes_protocol.name
KeyError: 'Protocol REF.growth protocol 2'
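I investigated the error and it seems to come from identifying process nodes by the name of the protocol they execute instead of by their position in the path, which is how sample nodes are handled in the same section of code. The columns are built from the longest path only, so a path that executes a different protocol looks up a column label that was never created. I think I was able to fix it by simply changing the process node code to be like the sample node code.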
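To make the cause concrete, here is a minimal standalone sketch of the two labelling schemes. It does not use isatools at all: `build_columns`, `fill_rows`, and the list-of-protocol-names representation of a path are simplifications invented for illustration, and only the "Protocol REF.{}" label format and the protocol names come from the traceback above.

```python
def build_columns(longest_path, by_position):
    """Collect one column label per process node on the longest path."""
    columns = []
    for position, protocol_name in enumerate(longest_path):
        key = position if by_position else protocol_name
        columns.append("Protocol REF.{}".format(key))
    return columns


def fill_rows(paths, columns, by_position):
    """Write one row per path into a dict of column lists."""
    table = {label: [] for label in columns}
    for path in paths:
        for label in table:
            table[label].append("")  # add an empty row for this path
        for position, protocol_name in enumerate(path):
            key = position if by_position else protocol_name
            table["Protocol REF.{}".format(key)][-1] = protocol_name
    return table


# Hypothetical paths: each path is reduced to the protocol names it executes.
paths = [["growth protocol"], ["growth protocol"], ["growth protocol 2"]]
longest_path = paths[0]

# Name-based labels: the third path has no matching column, so the lookup fails.
try:
    fill_rows(paths, build_columns(longest_path, by_position=False), by_position=False)
    name_based_error = None
except KeyError as err:
    name_based_error = err.args[0]

# Position-based labels: every path writes into "Protocol REF.0".
position_based = fill_rows(
    paths, build_columns(longest_path, by_position=True), by_position=True
)

print(name_based_error)
print(position_based)
```

With name-based labels the missing key matches the one in the traceback; with position-based labels all three rows land in the same column regardless of which protocol each path executes.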
New Code:
sample_in_path_count = 0
protocol_in_path_count = 0
longest_path = _longest_path_and_attrs(paths, s_graph.indexes)

# First pass: build the column labels from the longest path.
for node_index in longest_path:
    node = s_graph.indexes[node_index]
    if isinstance(node, Source):
        olabel = "Source Name"
        columns.append(olabel)
        columns += flatten(
            map(lambda x: get_characteristic_columns(olabel, x),
                node.characteristics))
        columns += flatten(
            map(lambda x: get_comment_column(
                olabel, x), node.comments))
    elif isinstance(node, Process):
        # Label process nodes by position in the path, not by protocol name.
        olabel = "Protocol REF.{}".format(protocol_in_path_count)
        columns.append(olabel)
        protocol_in_path_count += 1
        if node.executes_protocol.name not in protnames.keys():
            protnames[node.executes_protocol.name] = protrefcount
            protrefcount += 1
        columns += flatten(map(lambda x: get_pv_columns(olabel, x),
                               node.parameter_values))
        if node.date is not None:
            columns.append(olabel + ".Date")
        if node.performer is not None:
            columns.append(olabel + ".Performer")
        columns += flatten(
            map(lambda x: get_comment_column(
                olabel, x), node.comments))
    elif isinstance(node, Sample):
        olabel = "Sample Name.{}".format(sample_in_path_count)
        columns.append(olabel)
        sample_in_path_count += 1
        columns += flatten(
            map(lambda x: get_characteristic_columns(olabel, x),
                node.characteristics))
        columns += flatten(
            map(lambda x: get_comment_column(
                olabel, x), node.comments))
        columns += flatten(map(lambda x: get_fv_columns(olabel, x),
                               node.factor_values))

omap = get_object_column_map(columns, columns)

# load into dictionary
df_dict = dict(map(lambda k: (k, []), flatten(omap)))

# Second pass: fill one row per path using the same positional labels.
for path_ in paths:
    for k in df_dict.keys():  # add a row per path
        df_dict[k].extend([""])

    sample_in_path_count = 0
    protocol_in_path_count = 0
    for node_index in path_:
        node = s_graph.indexes[node_index]
        if isinstance(node, Source):
            olabel = "Source Name"
            df_dict[olabel][-1] = node.name
            for c in node.characteristics:
                category_label = c.category.term if isinstance(c.category.term, str) \
                    else c.category.term["annotationValue"]
                clabel = "{0}.Characteristics[{1}]".format(
                    olabel, category_label)
                write_value_columns(df_dict, clabel, c)
            for co in node.comments:
                colabel = "{0}.Comment[{1}]".format(olabel, co.name)
                df_dict[colabel][-1] = co.value
        elif isinstance(node, Process):
            olabel = "Protocol REF.{}".format(
                protocol_in_path_count)
            # Advance the counter, mirroring the Sample branch, so a path with
            # several process nodes does not write them all to "Protocol REF.0".
            protocol_in_path_count += 1
            df_dict[olabel][-1] = node.executes_protocol.name
            for pv in node.parameter_values:
                pvlabel = "{0}.Parameter Value[{1}]".format(
                    olabel, pv.category.parameter_name.term)
                write_value_columns(df_dict, pvlabel, pv)
            if node.date is not None:
                df_dict[olabel + ".Date"][-1] = node.date
            if node.performer is not None:
                df_dict[olabel + ".Performer"][-1] = node.performer
            for co in node.comments:
                colabel = "{0}.Comment[{1}]".format(olabel, co.name)
                df_dict[colabel][-1] = co.value
        elif isinstance(node, Sample):
            olabel = "Sample Name.{}".format(sample_in_path_count)
            sample_in_path_count += 1
            df_dict[olabel][-1] = node.name
            for c in node.characteristics:
                category_label = c.category.term if isinstance(c.category.term, str) \
                    else c.category.term["annotationValue"]
                clabel = "{0}.Characteristics[{1}]".format(
                    olabel, category_label)
                write_value_columns(df_dict, clabel, c)
            for co in node.comments:
                colabel = "{0}.Comment[{1}]".format(olabel, co.name)
                df_dict[colabel][-1] = co.value
            for fv in node.factor_values:
                fvlabel = "{0}.Factor Value[{1}]".format(
                    olabel, fv.factor_name.name)
                write_value_columns(df_dict, fvlabel, fv)
This is approximately lines 64-167 in isatools\isatab\dump\write.py in the write_study_table_files function. The changed code no longer errors and the converted study Tab from the JSON looks correct to me.
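For anyone who wants something more repeatable than eyeballing the output, a rough check along these lines might help. It is only a sketch: `protocol_refs` is a hypothetical helper, and `sample_tab` is a made-up two-row excerpt standing in for the regenerated study file, not the real BII-I-1 content. It assumes the written file uses the plain "Protocol REF" header, as ISA-Tab study files do.

```python
import csv
import io


def protocol_refs(tab_text):
    """Return the set of non-empty values found in all 'Protocol REF' columns."""
    reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
    header = next(reader)
    ref_columns = [i for i, name in enumerate(header) if name == "Protocol REF"]
    return {
        row[i]
        for row in reader
        for i in ref_columns
        if i < len(row) and row[i]
    }


# Hypothetical excerpt standing in for the regenerated study file.
sample_tab = (
    "Source Name\tProtocol REF\tSample Name\n"
    "culture1\tgrowth protocol 2\tsample1\n"
    "culture2\tgrowth protocol\tsample2\n"
)

found = protocol_refs(sample_tab)
print(sorted(found))
```

Reading the actual output file into `tab_text` and comparing `found` against the protocols expected for the study would confirm that both protocol names survived the round trip.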
ptth222 added a commit to ptth222/isa-api that referenced this issue on Sep 6, 2023:
Changed write_study_table_files and write_assay_table_files to count the protocol nodes instead of naming them by the protocol executed. Addresses issue ISA-tools#501.
Modified study and investigation files attached to the issue: s_BII-S-1.txt, i_investigation.txt