`tf.keras` version that allows any input resolution and doesn't use `Lambda` layers #16
base: master
resnet50_Weights_path='./pretrained_model/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'
IMAGE_ORDERING ='channels_last'
This should be passed as a variable, and ideally use TensorFlow's default.
TODO: default to `image_data_format` from Keras' config (`~/.keras/keras.json`).
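One way the TODO could be approached, as a minimal sketch: Keras keeps its default under the `image_data_format` key of `~/.keras/keras.json`, and `tf.keras.backend.image_data_format()` returns that value at runtime. The helper below reads the file directly with the standard library so that it runs without TensorFlow. The name `default_data_format` and the `'channels_last'` fallback are assumptions for illustration, not part of this PR.

```python
import json
from pathlib import Path


def default_data_format(config_path=None):
    """Return Keras' configured image data format, or 'channels_last'.

    Hypothetical helper: inside a tf.keras model file one would normally
    call tf.keras.backend.image_data_format() instead.
    """
    if config_path is None:
        config_path = Path.home() / ".keras" / "keras.json"
    try:
        with open(config_path) as fh:
            config = json.load(fh)
    except (OSError, ValueError):
        # Missing or unparsable config: fall back to the assumed default.
        return "channels_last"
    if not isinstance(config, dict):
        return "channels_last"
    value = config.get("image_data_format", "channels_last")
    if value not in ("channels_last", "channels_first"):
        raise ValueError(f"unexpected image_data_format: {value!r}")
    return value
```

A model builder could then declare `data_format=None` and resolve it with `data_format or default_data_format()`, which keeps the value a parameter while still honouring the user's configuration.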
resnet50_Weights_path='./pretrained_model/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'
IMAGE_ORDERING ='channels_last'
MERGE_AXIS=-1
This should follow from `data_format`, i.e. it should always be the channel axis.
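A minimal sketch of deriving the axis from the data format instead of hard-coding it. The function name `channel_axis` is hypothetical. It returns `-1` for `'channels_last'`, which is the same axis as `3` for the 4D image batches used here.

```python
def channel_axis(data_format="channels_last"):
    """Return the channel axis of a 4D image batch for the given data format."""
    if data_format == "channels_last":
        # Layout: (batch, height, width, channels)
        return -1
    if data_format == "channels_first":
        # Layout: (batch, channels, height, width)
        return 1
    raise ValueError(f"unknown data_format: {data_format!r}")
```

With a helper like this, both the batch-normalization axis and the concatenation axis can come from a single call, so they cannot drift apart.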
return x
def identity_block(input_tensor, kernel_size, filters, stage, block):
def identity_block(input_tensor, kernel_size, filters, stage, block, data_format='channels_last'):
Added `data_format` as a parameter everywhere, rather than the global constant `IMAGE_ORDERING`. Also used the same name as `tf` does for transparency.
x = ZeroPadding2D((3, 3), data_format=IMAGE_ORDERING)(img_input)
x = Conv2D(64, (7, 7), data_format=IMAGE_ORDERING, strides=(2, 2),kernel_regularizer=l2(weight_decay), name='conv1')(x)
class PadMultiple(Layer):
This pads to a multiple of `32`, or whatever is specified in `dims`.
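The arithmetic behind padding to a multiple can be sketched in plain Python. The function names below are hypothetical and this is not the PR's `PadMultiple` implementation. It only illustrates how many pixels have to be added per dimension.

```python
def pad_amount(size, multiple=32):
    """Return how many pixels must be added so that size becomes a multiple."""
    if size < 0 or multiple <= 0:
        raise ValueError("size must be >= 0 and multiple must be > 0")
    # Python's modulo of a negative number is non-negative, so this is the
    # distance from size up to the next multiple (0 if already aligned).
    return (-size) % multiple


def padded_shape(height, width, dims=(32, 32)):
    """Return the (height, width) after padding each dimension to its multiple."""
    return (height + pad_amount(height, dims[0]),
            width + pad_amount(width, dims[1]))
```

For example, a 100 x 65 input needs 28 extra rows and 31 extra columns, giving a padded shape of 128 x 96, while a dimension that is already a multiple of 32 is left unchanged.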
padded_to_multiple = PadMultiple((32,32))(img_input)
bn_axis = 3 if data_format == 'channels_last' else 1
merge_axis = 3 if data_format == 'channels_last' else 1
Is `merge_axis` ever something other than the channel dimension?
if light_version:
Duplicate code complicates code maintenance, so I merged the light version in here.
Hi @prhbrt, thanks a lot for looking into this and for contributing! FYI, we are planning to update and refactor this repo and integrate the model training code with https://github.com/qurator-spk/eynollah for future maintenance, so this comes in very handy. My colleagues @vahidrezanezhad and @michalbubula will be working on this - although due to various reasons, we likely won't be able to get our hands dirty much before March. But we will try our best to review and merge any contributions also beforehand. Btw the two models that you were missing should be available from our HF:
@cneud Understood! Could you in the meantime provide a list of all model architectures used for eynollah (specifically the Python code)? I couldn't find Python code for the column classifier in particular, so that one still has fixed dimensions. Also note that this yolo-version might give slightly different outputs than your patching example, due to boundary conditions. Thank you in advance!
@prhbrt
Since the U-Net architecture only uses layers that can scale with the image dimensions, the fixed dimensions seem artificial. I've added a zero-padding layer that increases the dimensions to the next multiple of 32. The padding is cut off at the end.
Moreover, I've removed the `Lambda` layer, as it creates a marshaling warning, and used `ZeroPadding2D`'s asymmetric padding feature instead. This removes a warning upon `load_model`.
I converted the eynollah models to this architecture. They should now load without warnings, use the `tensorflow.keras` API, and can be found here.
This might allow you to skip the patching as used in Eynollah and speed up the whole process. Please let me know what you think.
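To illustrate the pad-then-crop idea without TensorFlow, here is a small sketch on nested lists. It mirrors the `((top, bottom), (left, right))` tuple that `ZeroPadding2D` accepts for asymmetric padding. The helper names `zero_pad` and `crop`, and the choice to pad only at the bottom and right in the usage note, are assumptions for illustration and may differ from what the PR does.

```python
def zero_pad(image, padding):
    """Pad a non-empty 2D list with zeros; padding is ((top, bottom), (left, right))."""
    (top, bottom), (left, right) = padding
    width = len(image[0]) + left + right
    rows = [[0] * width for _ in range(top)]
    rows += [[0] * left + list(row) + [0] * right for row in image]
    rows += [[0] * width for _ in range(bottom)]
    return rows


def crop(image, padding):
    """Undo zero_pad by slicing the padded border off again."""
    (top, bottom), (left, right) = padding
    height, width = len(image), len(image[0])
    # Explicit end indices avoid the pitfall of slicing with -0.
    return [row[left:width - right] for row in image[top:height - bottom]]
```

Padding a 2 x 3 image with `((0, 2), (0, 1))` yields a 4 x 4 image, and cropping with the same tuple returns the original, which is the round trip the model performs around the network body.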
Notes and sanity checks:
I was missing `eynollah-main-regions_20220314` and `eynollah-textline_light_20210425`, so I couldn't convert them.