Add Processing to replace OCRProcessing #13

Closed
jukervin opened this Issue Feb 20, 2014 · 3 comments

jukervin commented Feb 20, 2014

The current process-recording elements are tied to OCR and are also somewhat redundant. I think it would make sense to change OCRProcessing to Processing, and to replace preProcessingStep, ocrProcessingStep, and postProcessingStep with a generic processingStep that has a processingStepType element recording the type of processing performed.

Currently:

<OCRProcessing ID="OCRPROCESSING_1">
  <preProcessingStep>
    <processingDateTime>2009-10-19</processingDateTime>
    <processingAgency>CCS Content Conversion Specialists GmbH, 
    </processingAgency>
    <processingStepDescription>align</processingStepDescription>
    <processingStepSettings>CCS OCR Processing Filter</processingStepSettings>
    <processingSoftware>
      <softwareCreator>CCS Content Conversion Specialists GmbH, Germany</softwareCreator>
      <softwareName>CCS docWORKS</softwareName>
      <softwareVersion>6.3-0.91</softwareVersion>
      <applicationDescription/>
    </processingSoftware>
  </preProcessingStep>
  <ocrProcessingStep>
    <processingSoftware>
      <softwareCreator>ABBYY (BIT Software), Russia</softwareCreator>
      <softwareName>FineReader</softwareName>
      <softwareVersion>8.1</softwareVersion>
    </processingSoftware>
  </ocrProcessingStep>
</OCRProcessing>

Suggestion

<Processing>
  <ProcessingStep ID="01">
    <processingDateTime>2009-10-19T10:10:10+05:00</processingDateTime>
    <processingStepType>image processing</processingStepType>
    <processingAgency>ACME Processing</processingAgency>
    <processingStepDescription>align</processingStepDescription>
    <processingStepSettings>ACME OCR Processing Filter</processingStepSettings>
    <processingSoftware>
      <softwareCreator>CCS Content Conversion Specialists GmbH, Germany</softwareCreator>
      <softwareName>CCS docWORKS</softwareName>
      <softwareVersion>6.3-0.91</softwareVersion>
      <softwareDescription/>
    </processingSoftware>
  </ProcessingStep>
  <ProcessingStep ID="02">
    <processingDateTime>2009-10-19T10:21:14+05:00</processingDateTime>
    <processingStepType>OCR</processingStepType>
    <processingAgency>CCS Content Conversion Specialists GmbH, www.content-conversion.com</processingAgency>
    <processingStepDescription></processingStepDescription>
    <processingStepSettings></processingStepSettings>
    <processingSoftware>
      <softwareCreator>ABBYY (BIT Software), Russia</softwareCreator>
      <softwareName>FineReader</softwareName>
      <softwareVersion>8.1</softwareVersion> 
      <softwareDescription/>
    </processingSoftware>
  </ProcessingStep>
  <ProcessingStep ID="03">
    <processingDateTime>2009-10-19T15:28:30+05:00</processingDateTime>
    <processingStepType>Proofreading</processingStepType>
    <processingAgency>ACME Corp.</processingAgency>
    <processingStepDescription></processingStepDescription>
    <processingStepSettings></processingStepSettings>
    <processingSoftware>
      <softwareCreator>ACME</softwareCreator>
      <softwareName>Proofreader</softwareName>
      <softwareVersion>9.9</softwareVersion>
      <softwareDescription/>
    </processingSoftware>
  </ProcessingStep>
</Processing>
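
To illustrate how existing files might be moved from the current structure to the suggested one, here is a minimal Python sketch using only the standard library. It is not part of the proposal: the function name `migrate_ocr_processing`, the type labels in `LEGACY_STEP_TYPES`, and the `STEP_nn` ID scheme are assumptions made for the example, and XML namespaces are ignored for brevity. The IDs carry a letter prefix because `xsd:ID` values must be valid NCNames and cannot begin with a digit.

```python
import copy
import xml.etree.ElementTree as ET

# Legacy step element name -> processingStepType label.
# The labels are illustrative assumptions; the proposal types the
# element as a free xsd:string and does not fix a vocabulary.
LEGACY_STEP_TYPES = {
    "preProcessingStep": "pre-processing",
    "ocrProcessingStep": "OCR",
    "postProcessingStep": "post-processing",
}


def migrate_ocr_processing(ocr_processing):
    """Build a <Processing> element from one legacy <OCRProcessing> element."""
    processing = ET.Element("Processing")
    counter = 0
    for legacy_step in ocr_processing:
        step_type = LEGACY_STEP_TYPES.get(legacy_step.tag)
        if step_type is None:
            continue  # not one of the three legacy step elements
        counter += 1
        step = ET.SubElement(processing, "ProcessingStep", ID=f"STEP_{counter:02d}")
        # processingStepType goes first, as in the proposed schema sequence.
        ET.SubElement(step, "processingStepType").text = step_type
        # Copy the legacy children (date, agency, software, ...) unchanged;
        # nothing is renamed, so e.g. applicationDescription stays as it is.
        for child in legacy_step:
            step.append(copy.deepcopy(child))
    return processing
```

Calling `ET.tostring(migrate_ocr_processing(element), encoding="unicode")` on a parsed `OCRProcessing` element returns the converted fragment as text.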

Schema changes:

<xsd:element name="OCRProcessing" minOccurs="0" maxOccurs="unbounded">
+  <xsd:annotation>
+    <xsd:documentation>DEPRECATED: The Processing element should be used instead.</xsd:documentation>
+  </xsd:annotation>
   <xsd:complexType>
     <xsd:complexContent>
       <xsd:extension base="ocrProcessingType">
         <xsd:attribute name="ID" type="xsd:ID" use="required"/>
       </xsd:extension>
     </xsd:complexContent>
   </xsd:complexType>
 </xsd:element>


+<xsd:element name="Processing" minOccurs="0" maxOccurs="unbounded">
+  <xsd:complexType>
+    <xsd:complexContent>
+      <xsd:extension base="ProcessingStepType">
+        <xsd:attribute name="ID" type="xsd:ID" use="required"/>
+      </xsd:extension>
+    </xsd:complexContent>
+  </xsd:complexType>
+</xsd:element>


<xsd:complexType name="ProcessingStepType">
<xsd:annotation> 
  <xsd:documentation>A processing step.</xsd:documentation>
</xsd:annotation>
 <xsd:sequence>

+  <xsd:element name="processingStepType" type="xsd:string" minOccurs="0"> 
+   <xsd:annotation>
+    <xsd:documentation>Type of processing step</xsd:documentation>
+   </xsd:annotation>
+  </xsd:element>

  <xsd:element name="processingDateTime" type="dateTimeType" minOccurs="0"> 
   <xsd:annotation>
    <xsd:documentation>Date or DateTime the image was processed.</xsd:documentation>
   </xsd:annotation>
  </xsd:element>
  <xsd:element name="processingAgency" type="xsd:string" minOccurs="0">
   <xsd:annotation>
    <xsd:documentation>Identifies the organization-level producer(s) of the
      processed image.</xsd:documentation>
   </xsd:annotation>
  </xsd:element>
  <xsd:element name="processingStepDescription" type="xsd:string" minOccurs="0" maxOccurs="unbounded">
   <xsd:annotation>
    <xsd:documentation>An ordinal listing of the image processing steps performed.
        For example, "image despeckling."</xsd:documentation>
   </xsd:annotation>
  </xsd:element>
  <xsd:element name="processingStepSettings" type="xsd:string" minOccurs="0">
   <xsd:annotation>
    <xsd:documentation>A description of any setting of the processing application.
        For example, for a multi-engine OCR application this might include the
        engines which were used. Ideally, this description should be adequate so
        that someone else using the same application can produce identical
        results.</xsd:documentation>
   </xsd:annotation>
  </xsd:element>
  <xsd:element name="processingSoftware" type="processingSoftwareType" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType> 
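
Since the ID attribute is typed as `xsd:ID`, its values must be valid NCNames and unique within the document; a value such as `01` would be rejected by a validating parser because it starts with a digit. The following Python sketch shows one way such IDs could be pre-checked without a schema validator. It assumes that each ProcessingStep carries an ID of the same type as the one declared on Processing, the function name `check_step_ids` is hypothetical, and the regular expression is a simplified ASCII-only approximation of the NCName rule.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, ASCII-only approximation of NCName (the lexical space of xsd:ID).
# The real rule also admits many non-ASCII letters.
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def check_step_ids(processing):
    """Return a list of problems found in the ProcessingStep ID attributes."""
    problems = []
    seen = set()
    for step in processing.iter("ProcessingStep"):
        step_id = step.get("ID")
        if step_id is None:
            problems.append("missing ID")
        elif not _ASCII_NCNAME.fullmatch(step_id):
            problems.append(f"invalid ID {step_id!r}")
        elif step_id in seen:
            problems.append(f"duplicate ID {step_id!r}")
        else:
            seen.add(step_id)
    return problems
```

An empty list means that no problem was detected by this simplified check; it is not a substitute for validating against the schema.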

@jukervin jukervin added this to the 2.2 milestone Feb 20, 2014

@jukervin jukervin self-assigned this Feb 20, 2014

@jukervin jukervin removed this from the 2.2 milestone Jun 5, 2014

@jukervin jukervin changed the title from "Change OCRProcessing to Processing" to "Add Processing to replace OCRProcessing" Jun 5, 2014

@Jo-CCS Jo-CCS added the 2 discussion label Sep 10, 2014

@jukervin jukervin modified the milestone: 3.1 Dec 11, 2014

@Jo-CCS Jo-CCS referenced this issue Mar 23, 2016: Glyphs (IMPACT) #26 (Closed)

cneud commented Jun 14, 2016

This seems very sensible to me!

Having a generic Processing element and an ID attribute on each processingStep would, it seems to me, also satisfy much of what was requested in #35. What is still missing, though, is a way to track exactly which elements were produced or altered by a particular processingStep.
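
One conceivable way to record which elements a step touched would be a reference attribute on the content elements that points back to the step IDs. The Python sketch below only illustrates that idea: the attribute name `PROCESSINGREFS`, its space-separated value format, and the function `elements_by_step` are hypothetical and not part of the proposal in this issue.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Hypothetical attribute holding a space-separated list of step IDs.
REF_ATTR = "PROCESSINGREFS"


def elements_by_step(root):
    """Map each referenced step ID to the elements that point to it."""
    index = defaultdict(list)
    for element in root.iter():
        # Elements without the attribute contribute nothing.
        for step_id in element.get(REF_ATTR, "").split():
            index[step_id].append(element)
    return dict(index)
```

With such an index, `elements_by_step(page)["STEP_02"]` would list every element that claims to have been produced or altered by the step with that ID.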

Jo-CCS commented Jun 16, 2016

Tracking changes to elements will be impossible to cover within a single XML file: XML is hierarchically structured, and changes made by (post-)processing actions will also change the hierarchy, which cannot be recorded. Elements might also be removed, after which they can no longer be referenced.
In such cases it makes much more sense to clone the files and simply add history records, so that it is known which file has which status and the versions can be compared.
Storage management systems do the rest: they avoid holding fully redundant data by saving only the changes, while keeping the ability to roll back to a former version.
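
As a rough illustration of comparing two cloned versions of a file, the Python sketch below produces a line-based unified diff with the standard `difflib` module. The function name `diff_versions` and the default file names are made up for the example; a line diff is only a stand-in for whatever a storage management system does internally, and it compares text, not XML structure.

```python
import difflib


def diff_versions(before, after, before_name="page.v1.xml", after_name="page.v2.xml"):
    """Return the unified diff between two serialized versions as a list of lines."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=before_name,
            tofile=after_name,
            lineterm="",  # inputs carry no newlines, so headers should not either
        )
    )
```

Printing `"\n".join(diff_versions(old_text, new_text))` shows removed lines prefixed with `-` and added lines prefixed with `+`.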

@cneud cneud referenced this issue Jun 16, 2016: Processing history #39 (Closed, 6 of 6 tasks complete)

cneud commented Jun 16, 2016

Continued in #39.

@cneud cneud closed this Jun 16, 2016
