Add new property/value mailbox channel #47

Closed
popcornmix opened this Issue Jun 17, 2012 · 121 comments

7 participants
Contributor

popcornmix commented Jun 17, 2012

Suggestion for property/value message passing interface from ARM to GPU.

We have a generic message passing structure. The first word is always the property of interest (could be small integers or fourcc codes).
The remainder of the structure is property specific data whose length depends on the property. This data is used for both input and output data.

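#define MBOX_CHAN_GETPROP 8
#define MBOX_CHAN_SETPROP 9

// generic property structure
struct property_s {
    uint32_t property;   // property of interest (small integer or fourcc)
    uint32_t data[];     // property-specific data, used for input and output
};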

// get property
void *property = <>
uint32_t success;
mbox_write(MBOX_CHAN_GETPROP, property);
mbox_read(MBOX_CHAN_GETPROP, &success);

// set property
void *property = <>
uint32_t success;
mbox_write(MBOX_CHAN_SETPROP, property);
mbox_read(MBOX_CHAN_SETPROP, &success);

Example:

#define CLOCK_EMMC 1

struct property_one_word_s {
    uint32_t property;
    uint32_t value;
} *emmc_clock;

uint32_t success;
emmc_clock =
emmc_clock->property = CLOCK_EMMC;
emmc_clock->value = 50000000;
_wmb();
mbox_write(MBOX_CHAN_SETPROP, emmc_clock);
mbox_read(MBOX_CHAN_SETPROP, &success);
_rmb();

We could also use this to handle the existing framebuffer allocation and power control in a more consistent way.

Any config.txt properties can be retrieved with this mechanism. E.g.
struct property_one_word_one_string_s {
uint32_t property;
uint32_t value;
uint8_t string[MAX_STRING];
} *config_txt;

Similarly cmdline.txt and other ATAGs (e.g. memory size) can be got.

Properties that can be got or set:
CLOCK_EMMC
CLOCK_UART
CLOCK_ARM
CLOCK_CORE
CLOCK_V3D
CLOCK_H264
CLOCK_ISP
POWER_CONTROL
FRAMEBUFFER

Properties that can be got:
CMDLINE
CONFIG_STRING
SERIAL_NUM
BOARD_REV
MAC_ADDRESS
DISPLAY_WIDTH
DISPLAY_HEIGHT
DISPLAY_DEPTH
DMA_CHANS
MEMORY_SIZE

Any additional suggestions?

lp0 commented Jun 17, 2012

I'd suggest including the length of each property and the size of the request/response buffer, so that it's not limited to predefined 32 bit values. The buffer could also hold multiple properties in a TLV format.

For command line, serial number and board revision I think these would arrive too late in the boot process if the mbox driver had to be used to get them. The command line and memory size/reserved memory must be passed through the device tree file. As the MAC address is static it can go in the device tree file too.

For power control the easiest option is to define a unique 32-bit identifier for each device that can be turned on or off, as this seems to work quite well in the device tree file (currently I use the value to define the bit of the power state array). I'd also prefer to always get a response, it's easier to return success/failure that way...

For the clocks there should be properties for getting the current state (enabled/disabled), rate, min rate, max rate. Instead of having to define multiple tags for each of these there could be CLOCK_ENABLE, CLOCK_GET_MIN_RATE etc. that had a unique clock identifier as part of the value. I also need to know which clocks are parents of the other clocks (so that any power saving disabling of clocks doesn't have unintended side effects).

See include/linux/clk-provider.h for what the clock interface expects. The enable/disable functions can't sleep... I can implement polled mode mbox calls but it'd need to have a response in the mailbox as soon as I've written the request.

How much of the mbox/bell/property interface is generic to all BCM2835 and how much of it is Raspberry Pi specific?

rosery commented Jun 17, 2012

Definitely the right sort of thing. Could you add SetProperty for

framebuffer_ignore_alpha=1
framebuffer_align=0x100000
hvs_swap_red_blue=1

then there would be no need to know about what is in kernel .img beforehand?
Thanks
John

lp0 commented Jun 17, 2012

I'd prefer not to have any access to config.txt at all because this information could be out of date. For example, we could have changed the clock rates since then (and forgotten that we did so because this might be a new kernel executed with kexec). You'd end up having to define two ways to access the same data.

popcornmix commented Jun 17, 2012

@lp0

For command line, serial number and board revision I think these would arrive too late in the boot process if the mbox driver had to be used to get them. The command line and memory size/reserved memory must be passed through the device tree file. As the MAC address is static it can go in the device tree file too.

Yes, you will still get the device tree options. It's useful to have an alternative mechanism for non-linux/devtree users.

See include/linux/clk-provider.h for what the clock interface expects. The enable/disable functions can't sleep... I can implement polled mode mbox calls but it'd need to have a response in the mailbox as soon as I've written the request.

Well you'll have to poll for GPU to respond. I'd imagine that was typically in the tens of microseconds range, but could be worse when we are busy (I'd have thought well under 1ms)

How much of the mbox/bell/property interface is generic to all BCM2835 and how much of it is Raspberry Pi specific?

The hardware is common to BCM2835. All BCM2835 platforms use the mbox/bell for some communications (like setting up vchiq). The property scheme hasn't been added to any platform yet, but could be added to all BCM2835 platforms if it proves useful.

lp0 commented Jun 17, 2012

Example get request:
u32 buffer size = 4096
u32 tag = 0x12340001 (get clock state)
u32 length = 4
u32 value = 0x42 (clock id 42)
u32 tag = 0x12340002 (get clock rate)
u32 length = 4
u32 value = 0x42 (clock id 42)
u32 tag = 0x0 (end of data)

Example get response:
u32 buffer size = 4096
u32 tag = 0x12340002 (clock state)
u32 length = 8
u32 value[2] = 0x42, 0x0 (clock id 42, off)
u32 tag = 0x0

Some sort of tag for "error, buffer is too small" would also be appropriate.
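
A minimal sketch of how a caller might assemble the request above in C. The helper names (`prop_buf_init`, `prop_buf_add`, `prop_buf_end`) are made up for illustration, the 0x1234xxxx tag values are just the placeholders from the example, and the layout (size word first, zero tag last, values padded to 32 bits) is an assumption rather than an agreed format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder tag values copied from the example; not a final spec. */
#define TAG_END             0x00000000u
#define TAG_GET_CLOCK_STATE 0x12340001u
#define TAG_GET_CLOCK_RATE  0x12340002u

/* Hypothetical builder state over a caller-owned array of 32-bit words. */
struct prop_buf {
    uint32_t *words;   /* backing storage */
    size_t    cap;     /* capacity in words */
    size_t    used;    /* words written so far */
};

/* Start a buffer: word 0 holds the total buffer size in bytes. */
static int prop_buf_init(struct prop_buf *b, uint32_t *storage, size_t cap_words)
{
    if (cap_words < 2)
        return -1;                      /* need the size word and an end tag */
    b->words = storage;
    b->cap = cap_words;
    b->used = 1;
    storage[0] = (uint32_t)(cap_words * sizeof(uint32_t));
    return 0;
}

/* Append one tag: id, value length in bytes, then the zero-padded value. */
static int prop_buf_add(struct prop_buf *b, uint32_t tag,
                        const void *value, uint32_t len)
{
    size_t value_words = ((size_t)len + 3u) / 4u;

    /* Keep one word in reserve for the end tag. */
    if (b->used + 2 + value_words + 1 > b->cap)
        return -1;                      /* the "buffer is too small" case */
    b->words[b->used++] = tag;
    b->words[b->used++] = len;
    memset(&b->words[b->used], 0, value_words * sizeof(uint32_t));
    if (len)
        memcpy(&b->words[b->used], value, len);
    b->used += value_words;
    return 0;
}

/* Terminate the tag list with a zero tag. */
static void prop_buf_end(struct prop_buf *b)
{
    b->words[b->used++] = TAG_END;
}
```

Reserving the end-tag word inside `prop_buf_add` means `prop_buf_end` can never overrun the buffer, so the caller only has one failure path to check.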

popcornmix commented Jun 17, 2012

I'd prefer not to have any access to config.txt at all because this information could be out of date. For example, we could have changed the clock rates since then (and forgotten that we did so because this might be a new kernel executed with kexec). You'd end up having to define two ways to access the same data.

Yes, there's less reason for linux to use this, with devtree/ATAGs. However it could be useful for non-linux (e.g. RISCOS) so it seems reasonable to support it in firmware, but not make use of it in linux.

lp0 commented Jun 17, 2012

With this scheme you don't actually need two mbox channels, as the tag id indicates get/set. The response message should probably be the same value (memory location) as the request message so that it can be used asynchronously with multiple buffers.

lp0 commented Jun 17, 2012

Yes, there's less reason for linux to use this, with devtree/ATAGs. However it could be useful for non-linux (e.g. RISCOS) so it seems reasonable to support it in firmware, but not make use of it in linux.

My point is that it gets out of date quickly as config.txt configures the VC. For example, if the OS queries for a config.txt property and it's not there it would need to know what the default is, which could vary based on firmware version. If it always retrieved the current data instead it wouldn't need to do this.

popcornmix commented Jun 17, 2012

@lp0

Example get request:
...
Example get response:

Can you confirm that the same buffer is shared for the request and response?
Presumably the success value returned through mbox could indicate buffer too small?

Feel free to specify the data format for any commands you care about.

lp0 commented Jun 17, 2012

Yes, the same buffer would be shared for the request and response.

The mbox values could carry a status code in the lower bits if we increased the alignment size of the buffer.
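
As a rough sketch of that idea: assuming the low 4 bits of the 32-bit mailbox word carry the channel number, aligning the buffer to 256 bytes would free bits 4 to 7 for a status code. The masks and function names below are hypothetical and only illustrate the packing.

```c
#include <stdint.h>

#define MBOX_CHAN_MASK    0x0000000Fu  /* low 4 bits: channel number */
#define MBOX_STATUS_MASK  0x000000F0u  /* hypothetical: bits 4-7 hold a status code */
#define MBOX_ADDR_MASK    0xFFFFFF00u  /* requires a 256-byte aligned buffer */

/* Pack address, status and channel into one word; -1 if the address is misaligned. */
static int mbox_pack(uint32_t buf_addr, uint32_t status, uint32_t chan,
                     uint32_t *word)
{
    if (buf_addr & ~MBOX_ADDR_MASK)
        return -1;
    *word = buf_addr
          | ((status << 4) & MBOX_STATUS_MASK)
          | (chan & MBOX_CHAN_MASK);
    return 0;
}

static uint32_t mbox_addr(uint32_t word)   { return word & MBOX_ADDR_MASK; }
static uint32_t mbox_status(uint32_t word) { return (word & MBOX_STATUS_MASK) >> 4; }
static uint32_t mbox_chan(uint32_t word)   { return word & MBOX_CHAN_MASK; }
```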

Could you enable the wiki for this repository so that a format could be documented?

popcornmix commented Jun 17, 2012

Could you enable the wiki for this repository so that a format could be documented?

Should be done.

rosery commented Jun 17, 2012

lp0: access to command line and memory sizes etc within RISCOS via this mechanism would be brilliant.. we do have sufficient control of the boot process that we can request this information in good time to make use of it.

lp0 commented Jun 17, 2012

Here's a first attempt at defining a format: https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface

I've left out the extra frame buffer tags for rosery to define. Depth should probably be replaced with a tag defining the whole fb_info format.

I don't know if this is going to be too verbose or not, but there's plenty of memory and a 4K page could let you get 200+ properties at once.

Update: I've specified that the value is padded to make the next tag 32-bit aligned.
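
For illustration, the padding rule can be sketched as below. It assumes each tag is laid out as a u32 tag id, a u32 value length and then the value; the function names are made up for this example.

```c
#include <stdint.h>

/* Round a value length in bytes up to the next multiple of 4. */
static uint32_t pad32(uint32_t len)
{
    return (len + 3u) & ~3u;
}

/* Byte offset of the next tag, given this tag's offset and value length.
 * Assumes an 8-byte tag header (tag id + length) before the value. */
static uint32_t next_tag_offset(uint32_t tag_offset, uint32_t value_len)
{
    return tag_offset + 8u + pad32(value_len);
}
```

With this rule a 6-byte value (such as a MAC address) occupies 8 bytes, so the following tag still starts on a 32-bit boundary.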

rosery commented Jun 17, 2012

@lp0 I have added tags for the 2 most critical RISCOS things. (I hope I've done it correctly.)

Were you proposing that the current framebuffer request channel be redundant? If so, we probably would benefit from a tag that gave and received all the frame buffer elements in 1, including the bytes per line, which can be significantly greater than the pixels per line. (This is currently in the standard fb channel request.)

Secondly, checking my understanding, in channel 8, the buffer used is presumably always within arm space, provided by the arm..

lp0 commented Jun 17, 2012

Were you proposing that the current framebuffer request channel be redundant?

Yes, it already can't be extended without breaking compatibility with existing kernels.

if so, we probably would benefit from a tag that gave and received all the frame buffer elements in 1

The request can include multiple tags that would get processed together.

Secondly, checking my understanding, in channel 8, the buffer used is presumably always within arm space, provided by the arm..

Yes

rosery commented Jun 17, 2012

On 17/06/2012 16:02, Simon Arlott wrote:

Were you proposing that the current framebuffer request channel be redundant?
Yes, it already can't be extended without breaking compatibility with existing kernels.

OK.. there are a couple more tags we need to add, then, to make sure we
have covered the functionality that currently exists. I'll give that a
go this evening.

John


popcornmix commented Jun 17, 2012

get timing has a response length of 8, but only u32 is described
set power state has a request length of 4, but two u32 values
get buffer has a response length of 8, but three u32 values
set buffer has a request/response length of 8, but three u32 values
framebuffer needs a pitch parameter (GPU requires padding for non-aligned widths)
Does framebuffer need virtual width/height for panning?

lp0 commented Jun 17, 2012

I've added all the other values that are currently used and fixed the sizes.

I've also added test versions of the setter properties so that new display settings can be proposed before setting them, and removed the set buffer option as it should be unnecessary if the settings are tested first.

lp0 commented Jun 17, 2012

Added a tag to release the buffer and included the alignment in the get buffer request.

popcornmix commented Jun 17, 2012

I'm not overly keen on the framebuffer tag interface.
All the other tags I can act on and fill in the response buffer as I go.
I would prefer a single tag containing all the framebuffer properties. If you want to just modify a subset of that, you can do a read/modify/write.
There's a bit of uncertainty if a list of tags contains the same tag twice. For non-framebuffer tags, I would just act on it twice and respond twice. For framebuffer tags, I guess I will ignore the first one.
I assume you don't mind about the order of tags returned? Might be easiest to just set a bit for each framebuffer tag seen, and respond to them in fixed order at the end.
I assume if I don't like any of the framebuffer tags, then I will ignore them all? E.g. if I'm in 640x480x32bpp and you ask for 4096x1536x16bpp, then I wouldn't put you in 640x480x16bpp mode, because that was the only tag I liked?

lp0 commented Jun 17, 2012

I agree that the other tags are easier to work with but the frame buffer is more complex. A single tag becomes limiting when it needs to be extended.

Yes, any order of response is ok - it'll just scan through them to get the ones it wants. The same tag shouldn't occur twice in the request so the behaviour in that scenario would be undefined. The same tag twice in the response also makes the result undefined. Using the last value in each case would be the likely outcome.

Yes, in Set mode it would refuse to change anything. In Test mode it should try to correct the values to the nearest limit or not return them if there's nothing it can do to fix it.

rosery commented Jun 17, 2012

@popcornmix @lp0 Doing the whole frame buffer in 1 tag would fit my needs well

What I would do, probably, is set/specify framealignment, PixelOrder and Alphamode individually, as these would need doing just once (in a while), whereas all the elements of framebuffer would need altering in unison for a screen mode change. Read/Modify/Write is fine..

lp0 commented Jun 17, 2012

I've modified the clock tags to specify that it always returns a tag even if the clock doesn't exist, so that it's easier to process the response if someone does request that the same clock rate be set multiple times...

lp0 commented Jun 17, 2012

What I would do, probably, is set/specify framealignment, PixelOrder and Alphamode individually, as these would need doing just once (in a while)

But now you've got some parts of it that get set once and some that get set every time you make a change, and the result if you change both of them at the same time isn't defined (can one of those changes fail but the other succeed?).

popcornmix commented Jun 17, 2012

Can I assume you won't mix test/get/set calls in a single message?
Will get buffer wait for other framebuffer set calls to be processed?
(Actually, can you list the tags you would expect to be processed at the end?)

Suppose you have a framebuffer in use that you have previously called get buffer on.
Now you call set width/height to a larger size. Will that fail until a release buffer is called?

lp0 commented Jun 17, 2012

Can I assume you won't mix test/get/set calls in a single message?

Yes, mixing test and set is invalid. Test and get would be ok in theory but I can see that being difficult to process.

Will get buffer wait for other framebuffer set calls to be processed.

Yes, I'd expect get to return after all the set operations have finished.

Suppose you have a framebuffer in use that you have previously called get buffer on.
Now you call set width/height to a larger size. Will that fail until a release buffer is called?

If you don't include a get buffer tag then the buffer is not allowed to change base or size so the request would fail. If a get buffer tag is included then the new buffer is returned when the change is successful.

I've renamed "get buffer" to "allocate buffer".

popcornmix commented Jun 17, 2012

Okay, so a sequence of sets and a get buffer implicitly frees the old buffer.
You don't have to explicitly call release buffer except when the framebuffer is destroyed.

lp0 commented Jun 17, 2012

Correct - I'll add that to the page

lp0 commented Jun 17, 2012

MEMORY_SIZE

I know you want to make it easier for OSes that don't want to parse DT or ATAGs but isn't there a problem with using memory to pass buffers around before you know how much memory you have?

If issue #45 gets implemented then additional memory can be allocated to the VC in suitable chunks.

@rosery, what does RISCOS do to determine available memory?

I've added tags for this anyway...

rosery commented Jun 17, 2012

@lp0 In RISCOS what happens is (roughly):
HAL + kernel starts at physical address 0 - this is our kernel.img
some minimal code is run in the HAL to gain control of the ARM chip.
it will then interact with the mailbox, before the mmu is enabled, to inquire the mem size.
(and get an initial frame buffer..). a few other hardware related things are also initialised.

This kernel is then invoked, with parameters that include the location of the 'kernel.img' and the amount of ram available. It is only at this stage that general memory map is constructed.

The ram size was initially taken from ATAGs, but this mailbox method would suit considerably better.

lp0 commented Jun 17, 2012

Ok. I've added a get clocks property too, so that we know the dependencies between the clocks.

rosery commented Jun 17, 2012

Frame buffer stuff is looking good. A question: allocating a frame buffer requires a minimum of set width/height/depth. Will the other tags get returned, even if not requested? E.g. pitch is (almost always) implicitly set from the 'fit' of the request into the screen attached, etc.

lp0 commented Jun 17, 2012

Tags won't be returned unless requested. I'll remove the Test/Set versions of pitch. I don't want to merge it into "allocate buffer" as that complicates it by requiring that the pitch also doesn't change if the existing buffer is being reused.

swarren commented Jun 18, 2012

Since lp0 wrote the wiki, not popcornmix, I assume this is some completely new API that isn't defined/implemented yet (which is slightly at odds with popcornmix having said "We have a generic message passing structure" rather than "We could/can have ...").

This sounds like a reasonable approach in general.

The buffer alignment to 256 bytes isn't hard to achieve, but potentially does waste a bit of memory. Can we make the buffers 16-byte aligned and use the entire 28-bit message portion of the mailbox to contain the buffer address? We can simply move the request/response code to be the second u32 in the buffer. The only issue here would be the lack of response code if the memory address was invalid, but that seems acceptable to me.

I agree with lp0's comments that all APIs should reflect current state in general, so that the responses are always "accurate" at current time. Of course, we can always define some messages that specifically request initial state or config.txt content if there's a use for that, although it does seem slightly unlikely.

I would suggest that for each tag, we take the max of request and response data size, and allocate that much space in the buffer in the request even if it isn't needed. That way, when the VC is generating the response, it won't need to re-write the tag IDs and move them about based on request/response size differences, but rather it will just fill the data into the pre-existing space after the tag. Perhaps each tag should also contain an individual allocated length field for error-checking purposes and extensibility - something like:

Overall format:
u32 overall buffer length
u32 request/response code

Then a list of tags, each:
u32 tag buffer length (this length field's length plus tag ID field's length plus length of request/response data area following)
u32 tag ID
u8[tag buffer length - 8] request/response data

For variable-length response (e.g. a request like get command line), the tag-specific response format could be :
u32 needed length
null-terminated string

That way, if the request's tag buffer length (minus 8) isn't large enough for the response data, the VC can fill in what it can, but still indicate the length needed to return the entire string, and then the ARM-side can send a larger buffer next time.
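
A minimal sketch of how the ARM side might walk a buffer in the layout proposed above to locate one tag. `find_tag` is a hypothetical helper, and the bounds checks reflect one reading of the proposal (tag buffer length counts its own two header words and is a multiple of 4), not a settled format.

```c
#include <stddef.h>
#include <stdint.h>

/* Find the data area of the first tag with the given id.
 * Assumed layout:
 *   word 0: overall buffer length in bytes
 *   word 1: request/response code
 *   then tags: u32 tag buffer length (including the two header words),
 *              u32 tag id, followed by the data area
 * Returns a pointer to the data and stores its size in bytes, or NULL. */
static uint32_t *find_tag(uint32_t *buf, uint32_t want_id, uint32_t *data_len)
{
    uint32_t total = buf[0];
    uint32_t off = 8;                        /* skip the two header words */

    while (off + 8 <= total) {
        uint32_t tag_len = buf[off / 4];
        uint32_t tag_id  = buf[off / 4 + 1];

        /* Reject lengths that are too short, unaligned or overrun the buffer. */
        if (tag_len < 8 || (tag_len & 3u) || tag_len > total - off)
            return NULL;
        if (tag_id == want_id) {
            *data_len = tag_len - 8;
            return &buf[off / 4 + 2];
        }
        off += tag_len;
    }
    return NULL;
}
```

Because every tag carries its own allocated length, the walker never needs to know the per-tag data formats in order to skip tags it does not recognise.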

I haven't actually reviewed the specific messages in the wiki in detail yet.

swarren commented Jun 18, 2012

I put some comments inline in the wiki. I hope that's a reasonable place - easier to see which parts I'm talking about that way.

Re:

See include/linux/clk-provider.h for what the clock interface expects. The enable/disable functions can't sleep... I
can implement polled mode mbox calls but it'd need to have a response in the mailbox as soon as I've written
the request.

Even though the Linux clock API enable/disable functions can't sleep, prepare/unprepare can. I believe the idea is that if you can enable/disable your clocks without blocking/scheduling, you do so inside the enable/disable functions, but if you need to block/schedule, you do so inside the prepare/unprepare functions. BCM2835-specific drivers might be aware of which of prepare or enable is going to actually enable the clocks, and call them at the appropriate times, e.g. calling prepare later rather than earlier if possible. And I think the differentiation is whether enable has to block/schedule rather than the amount of time it takes, so you could always poll in enable for a "long" time if you have to, although IIRC this ends up holding clock locks while doing it, so isn't the best idea.

lp0 commented Jun 18, 2012

I would suggest that for each tag, we take the max of request and response data size, and allocate that much space in the buffer in the request even if it isn't needed.

The caller may not know the size of the response in advance if it gets increased to provide more information. You first suggest 256 bytes is too much and then to leave the request unmodified, which will also use more memory?

That way, if the request's tag buffer length (minus 8) isn't large enough for the response data, the VC can fill in what it can, but still indicate the length needed to return the entire string, and then the ARM-side can send a larger buffer next time.

I wouldn't want to have to do that in the kernel... it'd need to be able to re-arrange all the tags to accommodate every one whose response was now too big.

Even though the Linux clock API enable/disable functions can't sleep, prepare/unprepare can.

The description also implies that prepare is too early to enable them. I was planning on polling until it completes.

swarren commented Jun 19, 2012

Re: tag re-allocation: I'd actually expect any message to contain only 1 tag at a time, so there wouldn't be any need for the kernel to re-arrange tags if the first attempt didn't have a large enough buffer. Still, that would also imply that the VC wouldn't have to re-arrange tags in order to generate the responses, which ends up making my suggestion less useful anyway.

Re: prepare API: I believe the whole reason for the prepare API's existence (it was only fairly recently added IIRC) is so that prepare can enable the clocks if the code needed to enable the clock can't be implemented in enable for some reason, such as blocking. Still, if polling won't take too long, that's probably acceptable too.

swarren commented Jun 19, 2012

Oh, re: "You first suggest 256 bytes is too much and then to leave the request unmodified, which will also use more memory?"

I didn't suggest leaving the request unmodified, but rather leave all the tag and length fields in the same place, and just replace the request data with response data. That way, you don't have to move all the tags about when generating the response. And the size of the request/response itself wasn't the issue I wanted fixed, but the need to align the /start/ of the packet to a 256-byte boundary.

rosery commented Jun 19, 2012

@swarren - the concept of padding the datalength of the request tag to the same as the response tag, for most tags, makes excellent sense. It means that the responder does not need to do any allocation or tag shifting, instead all it has to do is insert responses in the spaces provided. The only sort of tag that may need additional treatment is a tag of unknown response length, such as reading a commandline string. In this case, the behaviour should perhaps be to return what can fit in the space provided, with one field reporting the total length needed for a full reply
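
A rough sketch of the responder side for such a variable-length tag, assuming the data area starts with a u32 "needed length" followed by the string. Counting the terminator in the needed length and keeping a truncated string NUL-terminated are choices made for this example, and `fill_string_response` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Fill a tag's data area with: u32 needed length, then as much of the
 * NUL-terminated string as fits.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
static int fill_string_response(uint8_t *data, uint32_t data_len, const char *s)
{
    uint32_t needed = (uint32_t)strlen(s) + 1u;   /* includes the terminator */
    uint32_t room, n;

    if (data_len < 4)
        return 0;                                 /* no room even for the length */
    memcpy(data, &needed, 4);                     /* length word in host byte order */
    room = data_len - 4;
    n = needed < room ? needed : room;
    memcpy(data + 4, s, n);
    if (n > 0 && n < needed)
        data[4 + n - 1] = '\0';                   /* keep the partial string terminated */
    return n == needed;
}
```

On a truncated reply the caller can read the first word, allocate at least that many bytes plus 4 for the data area, and repeat the request.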

adammw commented Jun 19, 2012

sorry to step on your toes while still designing, but will these apis/channels allow changing of the overscan (and similar config.txt parameters) while the device is running?

popcornmix commented Jun 19, 2012

@adammw
No. While we can add tags to describe scaling of the framebuffer, that wouldn't help a linux user as there is no plumbing that will make this get called. If you have further comments on this please create a new issue.

lp0 commented Jun 19, 2012

Those could be added to the framebuffer driver as sysfs parameters

popcornmix commented Jun 19, 2012

Okay we should add tags to describe this. We already have source offset_x, offset_y, width and height.
We need the same four parameters in destination space.

lp0 commented Jun 19, 2012

I've updated the format to respond in-place.

lp0 commented Jun 19, 2012

Can you support overscan top/bottom/left/right parameters and change them without moving the buffer base address?

popcornmix commented Jun 19, 2012

Yes.

popcornmix commented Jul 1, 2012

I've updated the format to respond in-place.

This isn't clear to me. So the tags now remain in place for the response?
is the:

  • u32: value buffer size in bytes

the maximum of request length and response length?

Do you want ARM and VC memory addresses as bus or physical addresses?

lp0 commented Jul 1, 2012

It's the maximum of the request and response lengths. The value after that indicates how much is actually used and whether it's a request or a response. Yes, they now remain in place.

Be careful how you handle a kernel that is newer/older than the VC and so provides a buffer that's smaller/larger than you expect for the response.
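
To make the per-tag header concrete, here is a minimal sketch. It assumes the top bit of the third word marks a response and the remaining bits hold the used length, and that a responder which needed more room reports the full length it wanted; both are assumptions for illustration, and the struct and function names are hypothetical.

```c
#include <stdint.h>

#define TAG_RESPONSE_BIT 0x80000000u  /* assumed: top bit marks a response */

struct prop_tag {
    uint32_t id;        /* tag identifier */
    uint32_t buf_size;  /* value buffer size: max of request and response length */
    uint32_t code;      /* used length plus the request/response indicator */
    /* value bytes follow, padded to a 32-bit boundary */
};

static int tag_is_response(const struct prop_tag *t)
{
    return (t->code & TAG_RESPONSE_BIT) != 0;
}

static uint32_t tag_used_len(const struct prop_tag *t)
{
    return t->code & ~TAG_RESPONSE_BIT;
}

/* Nonzero if the responder wanted more room than the caller provided,
 * e.g. a newer VC answering an older kernel. */
static int tag_truncated(const struct prop_tag *t)
{
    return tag_is_response(t) && tag_used_len(t) > t->buf_size;
}
```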

I'd prefer the ARM physical addresses as used by the existing framebuffer interface.

popcornmix commented Jul 1, 2012

  • I'd prefer the ARM physical addresses as used by the existing framebuffer interface.

I believe the framebuffer interface uses bus addresses (i.e. GPU addresses like 0x4xxxxxxx).
From my point of view a physical address is before the VideoCore MMU, e.g.
0x00000000 for RAM
0x20000000 for peripherals

A bus address is after the VideoCore MMU. e.g.
0x40000000 for RAM
0x7e000000 for peripherals
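
Going only by the example addresses above, the two views differ by a fixed offset per region, which could be sketched as follows. The macro and function names are hypothetical, and a real port should confirm the bases for its own configuration.

```c
#include <stdint.h>

/* Base addresses taken from the examples above; treat them as illustrative. */
#define PHYS_RAM_BASE     0x00000000u
#define PHYS_PERIPH_BASE  0x20000000u
#define BUS_RAM_BASE      0x40000000u
#define BUS_PERIPH_BASE   0x7E000000u

static uint32_t ram_phys_to_bus(uint32_t phys)
{
    return phys - PHYS_RAM_BASE + BUS_RAM_BASE;
}

static uint32_t ram_bus_to_phys(uint32_t bus)
{
    return bus - BUS_RAM_BASE + PHYS_RAM_BASE;
}

static uint32_t periph_phys_to_bus(uint32_t phys)
{
    return phys - PHYS_PERIPH_BASE + BUS_PERIPH_BASE;
}

static uint32_t periph_bus_to_phys(uint32_t bus)
{
    return bus - BUS_PERIPH_BASE + PHYS_PERIPH_BASE;
}
```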

popcornmix commented Jul 1, 2012

With set power state

  • Bit 0: 0=off (failed), 1=on (ok)

This seems ambiguous. If I just turned it off (successfully) do I return 0=off (failed) or 1=on (okay) ?

lp0 commented Jul 1, 2012

I've removed the text in brackets - it's the new state of the device, which would be unchanged if it failed.

popcornmix commented Jul 1, 2012

I think we also need a palette option get/set/test for 8bpp modes.

  • u32 First entry
  • u32 Number of entries
  • n * u32 palette entries in RGBA format.
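
A small sketch of filling in that value layout. The byte order inside each RGBA word (R in the low byte) and the 256-entry limit for 8bpp are assumptions, and both helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack one palette entry; assumes R in the low byte, then G, B, A. */
static uint32_t rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return (uint32_t)r | ((uint32_t)g << 8) |
           ((uint32_t)b << 16) | ((uint32_t)a << 24);
}

/* Write "first entry, number of entries, entries..." into out.
 * Returns the number of words written, or 0 if the request is invalid
 * or does not fit. */
static size_t palette_request(uint32_t *out, size_t out_words, uint32_t first,
                              const uint32_t *entries, uint32_t count)
{
    uint32_t i;

    if (first > 256 || count > 256 - first)   /* 8bpp: at most 256 entries */
        return 0;
    if (out_words < 2u + (size_t)count)
        return 0;
    out[0] = first;
    out[1] = count;
    for (i = 0; i < count; i++)
        out[2 + i] = entries[i];
    return 2u + (size_t)count;
}
```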

lp0 commented Jul 1, 2012

I believe the framebuffer interface uses bus addresses (i.e. GPU addresses like 0x4xxxxxxx).

Ok

I think we also need a palette option get/set/test for 8bpp modes.

I've added this to the list

swarren commented Jul 3, 2012

It'd be much better if all APIs just used the physical addresses. If the kernel wants to interpret them (physical addresses), it knows how the ARM's MMU is configured, and can convert them to ARM-side virtual addresses. If the VC firmware wants to interpret them (physical addresses), it knows how the VC's MMU is configured, and can convert them to VC-side virtual addresses. Requiring software running on the ARM to know how the binary blob running on the VC configured the VC's MMU seems like far too much knowledge, and a layering violation.

popcornmix commented Jul 6, 2012

I'm happy to change this API to use physical addresses.
I've implemented most of the code described here. I've done some sanity tests, but otherwise it is not well tested. I'll release a version for testing this weekend.

Couple more features that could be useful:
We can support transforms of display, which can be useful for portrait displays. You can see the 8 orientations in /opt/vc/include/interface/vmcs_host/vc_dispservice_x_defs.h.

We can switch display from HDMI to composite, and choose HDMI and composite modes. There are quite a lot of options, but basically the options in http://elinux.org/RPi_config.txt relating to sdtv and hdmi could be done at runtime.

Perhaps a set output composite message with parameters:
use default (NTSC, possibly overridden by config.txt)
or
sdtv_mode, sdtv_aspect
and a set output hdmi with parameters:
use default (negotiated with TV, possibly overridden by config.txt)
or
hdmi_group, hdmi_mode, hdmi_drive

rosery commented Jul 7, 2012

looking forward to that test version. many thanks John

popcornmix commented Jul 8, 2012

You can try this:
https://dl.dropbox.com/u/3669512/temp/start.elf

I think something is implemented for all commands.
The clock messages aren't complete, although you should be able to get/set the frequency.
I've made a decent stab at framebuffer messages, but as it's quite complicated there are likely to be bugs/misunderstandings.
I'll be working on this today, so there may be a better version tonight, but if you spot anything wrong, then let me know.

[edit]
I've updated the linked start.elf. I believe everything in spec is now implemented. Not everything is tested...

popcornmix commented Jul 8, 2012

Another suggestion. A display driver should have the ability to block until next vsync.
Can we add a message that is known to respond after the next vsync?
See #67

lp0 commented Jul 9, 2012

As long as it will still handle other accesses asynchronously that shouldn't be a problem. The VideoCore should keep processing other requests and keep the response to the message with a vsync tag in a separate task.

popcornmix commented Jul 9, 2012

Would a real ISR be better?
While the ARM can't handle the real VSYNC ISR, the GPU can "force" an interrupt during its ISR, that the ARM sees.
Is that more convenient?

That makes a lot more sense IMHO. The message we need is then simply to turn these ISR's on/off.

rosery commented Jul 9, 2012

sounds good to me too.. an irq to the arm can be acted on or ignored at the arm's discretion.. thanks

Are we agreed that we need a message to turn these on/off? Having an interrupt 60 times a second that we may never use sounds very expensive. Shall I add it to the spec?

In a similar vein, the set power state message with the wait option is a bit of an anomaly too. What would happen if this tag was combined with other tags? Would they be processed immediately if they came before that tag but not until the device was ready if they came after? If all the VC is doing is waiting the number of microseconds specified by get wait time then it seems unnecessary functionality, if it is doing something more then perhaps this would be better serviced by use of a specific interrupt too.

popcornmix commented Jul 10, 2012

I don't actually believe the power waiting takes any measurable time.
I think it is microseconds, and is lost in the noise when waiting for the interrupts/context switches of ARM and GPU in just handling the message.

Also interrupts can be masked, so a separate disable message is not essential.

Thanks for the response and sorry for forgetting about interrupt masking, it's been a while since I've done o/s coding and it's coming back slowly!
I thought I'd seen a reference somewhere on t'web to an interrupt (rather than busy wait) way of checking for responses in the mailbox but can't find it now. In the light of your first paragraph I'm guessing this isn't necessary or available; is that right? Is there a specification somewhere for the "right" way to write/read mailboxes? I wrote my code by putting together stuff posted in various places by various people, which they got from looking at various bits of the linux source, but I'm still not sure I've got the right memory barriers in the right places. Is anyone up for creating (or correcting if I make a start on) a page on the firmware wiki to give the authoritative view on this (preferably including the assembly instructions for each type of barrier)?

Sorry for the long post, but I thought I’d get back on topic and post a few questions/requests for clarification about the new message specification document:

The callee is not allowed to return a different buffer address, this allows the caller to make independent asynchronous requests.

Surely rather than allowing asynchronous requests the justification for this should be that the VC can't allocate memory in the ARM memory space and there's no way for the ARM to free memory in the VC memory space?

All u64/u32/u16 values are in host CPU endian order

Have I misunderstood this or is the VC able to detect in which endian mode the ARM is operating?

Response may include unsolicited tags

Are there currently any examples where this might happen?

0x80000001: error parsing request buffer (partial response)

Does “partial response” mean that any tag with the response bit set has been processed and any tag without hasn't? Because that's surely the assumption anyway, we don't need a value to tell us whether every tag has its response bit set—that’s duplicating information. If it means that the whole structure couldn't be parsed then that makes more sense, e.g. if following the tag chain goes beyond the total buffer length or the VC is checking that the total buffer length doesn't extend into memory that doesn't belong to the ARM. If that's what this means then perhaps we could explicitly state that. If it isn't then is this whole request/response field unnecessary?

u8...: padding

Am I right in thinking that there is alignment padding at the end of each tag and therefore padding is not required at the end of the whole buffer? Equivalently, does the response to each tag have to fit in the buffer for the request for that tag or can tags be rearranged by the callee within the total buffer? If the latter then we could avoid the need to specify buffer size for each tag, assuming it to be the length required by the value rounded up to the nearest 32 bits, if the former then padding at the end of the total buffer will never be used.

Future formats may specify multiple base+size combinations

Does it make sense just to specify this now by saying that the response length is a multiple of 8 and the response value can be repeated?

Caller should not assume the string is null terminated

Would it be better to specify that the string is not null terminated to avoid the caller having to check for and ignore null characters (or conversely to allow null characters to act as separators within the command line if desired)?

Caller assumes that the VC has enabled all the usable DMA channels

Can I understand this to mean that it is safe for the caller to make this assumption? I.e. is it true that the VC will always have done this?

it doesn't seem to make sense for the request to have a "clock exists" bit, only the response

This seems like a fair comment, should this change be made to the spec?

Blank screen

Does this try to put the screen into power-saving mode? Either way should this information be specified here?

Test palette

In what circumstances might this test fail (other than a malformed request)? Similarly, what does the response from Set palette indicate?

Apologies again for the long post, feel free to respond just to the bits that you feel need a response.

Further to my previous message, I've created pages on the wiki to describe mailboxes and their access. If anyone wants to fill in the blanks it should be relatively easy for you now.

rosery commented Jul 15, 2012

Hi. I've been testing this interface and have run into a bit of a problem.. probably my understanding.

In the very early stages of boot, I wish to capture useful things such as mac address, memory sizes, etc.. This is in machine code, before the MMU is turned on. I have issued a message (get mac address btw). The message channel completes, but the arm side tag buffer is not (apparently) filled in by the VC side. Once the boot has been fully completed I see that actually the buffer was filled in (and can see the mac address there).

I am giving it a valid arm side physical buffer address of 0x1000. Presumably there is something I need to do to oblige the arm and vc sides to synchronise. (I have tried the arm side mem barrier codes using CP15 with c7,c10,4 or 5.)

Can anyone shed light on this? .. is it waiting for the vc side to flush its cache? is it likely to be that the arm side data cache is ignoring the fact that the ram content was changed by the vc side?

rosery commented Jul 15, 2012

Additionally, the get arm memory returns 0 and the get vc memory returns (either the correct value or all the memory .. I cannot recall) . Is this the expected behaviour at this stage? .. using the start.elf from dropbox above

Thanks

rosery commented Jul 15, 2012

Thanks BTW for all the work folks have put into the WIKI pages for this. They have been a considerable help

swarren commented Jul 16, 2012

Re: the responses apparently not being filled in: is the cache on? If so, you need cache flushes and invalidates at the appropriate times

rosery commented Jul 16, 2012

@swarren Thanks.. for the avoidance of doubt I put a flush data cache instruction (MCR CP15,0,R3,C7,C6 with R3=0) before I wrote to the message channel, after the channel had reported message completed, AND just before attempting to read the tag buffer.

The read from the buffer reported 0 (the initial state). Once booting had completed the buffer contents were correct (0x80000006). (the 3rd word of the tag buffer for read mac address)

@popcornmix Is an equivalent cache flush required within the VC side?.. I'm more than happy to send this code if that helps..

or have I completely misunderstood the message timing.. I understand that once the mailbox has replied, then the tag buffer should be updated?

Thanks

Contributor

popcornmix commented Jul 16, 2012

No need to flush VC side.
Can you confirm what the address you are writing through the mailbox?
Currently it needs to be a bus address (i.e. 0x4xxxxxxx when L2 is enabled or 0xCxxxxxxx when L2 is disabled).
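
As a rough illustration of the aliasing described in this comment, a helper could convert between ARM physical and bus addresses as sketched below. Only the 0x4/0xC prefixes come from the comment; the function names and the `l2_enabled` flag are hypothetical, and the sketch reflects the behaviour described at this point in the thread.

```c
#include <stdint.h>

/* Alias bits taken from the comment above; everything else is illustrative. */
#define BUS_ALIAS_L2_ENABLED  0x40000000u
#define BUS_ALIAS_L2_DISABLED 0xC0000000u
#define BUS_ALIAS_MASK        0xC0000000u

/* Hypothetical helper: ARM physical address -> bus address for the mailbox. */
static uint32_t arm_to_bus(uint32_t phys, int l2_enabled)
{
    uint32_t alias = l2_enabled ? BUS_ALIAS_L2_ENABLED : BUS_ALIAS_L2_DISABLED;
    return (phys & ~BUS_ALIAS_MASK) | alias;
}

/* Hypothetical helper: strip the alias bits to recover the ARM physical address. */
static uint32_t bus_to_arm(uint32_t bus)
{
    return bus & ~BUS_ALIAS_MASK;
}
```

With L2 enabled, the buffer at ARM physical address 0x1000 mentioned earlier would be passed as 0x40001000 under this scheme.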

@rosery Are you saying that the buffer is filled in after booting without sending an additional request? Because that wouldn't fit with writing to the wrong register address and does sound more like caching as you suggest.

The instruction you used was invalidate data cache, do you get better results using either Clean and Invalidate or Data Memory Barrier? (the relevant instructions are summarised on https://github.com/raspberrypi/firmware/wiki/Accessing-mailboxes). The advantage of the latter is that it can also be used from user code.

swarren commented Jul 16, 2012

You need to:

  1. Write to the message buffer
  2. Flush the cache (possibly need a barrier before that)
  3. Do the mailbox I/O
  4. Invalidate the cache
  5. Read the message buffer (possibly need a barrier after that)
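
The ordering in the list above can be sketched in C with the platform-specific parts left as hooks. The `mbox_ops` structure, the hook names and the tracing stand-in are all hypothetical; a real port would implement the hooks with CP15 cache operations and the mailbox registers. Exactly where the barriers belong is the open question in this thread, so their placement here is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical platform hooks (not from the firmware or any library). */
struct mbox_ops {
    void (*barrier)(void *ctx);
    void (*clean_dcache)(void *ctx, const void *buf, size_t len);
    void (*invalidate_dcache)(void *ctx, const void *buf, size_t len);
    void (*mbox_write)(void *ctx, unsigned chan, uint32_t addr);
    uint32_t (*mbox_read)(void *ctx, unsigned chan);
    void *ctx;
};

/* One request/response round trip. The caller has already filled `buf`
 * (step 1) and reads the response from it after this returns (step 5). */
static uint32_t mbox_call(const struct mbox_ops *ops, unsigned chan,
                          void *buf, size_t len, uint32_t buf_addr)
{
    uint32_t reply;

    ops->barrier(ops->ctx);                      /* order the buffer writes  */
    ops->clean_dcache(ops->ctx, buf, len);       /* step 2: push buffer out  */
    ops->mbox_write(ops->ctx, chan, buf_addr);   /* step 3: post the request */
    reply = ops->mbox_read(ops->ctx, chan);      /*         wait for a reply */
    ops->invalidate_dcache(ops->ctx, buf, len);  /* step 4: drop stale lines */
    ops->barrier(ops->ctx);                      /* before reading the reply */
    return reply;
}

/* Host-side stand-in that records the call order, so the sequence can be
 * tried off-target. Each hook appends one letter to `trace`. */
static char trace[16];
static size_t trace_len;

static void note(char c)
{
    if (trace_len < sizeof trace - 1)
        trace[trace_len++] = c;
}

static void t_barrier(void *ctx) { (void)ctx; note('b'); }
static void t_clean(void *ctx, const void *b, size_t n)
{ (void)ctx; (void)b; (void)n; note('c'); }
static void t_inval(void *ctx, const void *b, size_t n)
{ (void)ctx; (void)b; (void)n; note('i'); }
static void t_write(void *ctx, unsigned ch, uint32_t a)
{ (void)ctx; (void)ch; (void)a; note('w'); }
static uint32_t t_read(void *ctx, unsigned ch)
{ (void)ctx; (void)ch; note('r'); return 0; }

static const struct mbox_ops trace_ops = {
    t_barrier, t_clean, t_inval, t_write, t_read, NULL
};
```

Keeping the cache and mailbox operations behind hooks means the ordering can be checked on a host machine before the real CP15 and register accesses are filled in.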

rosery commented Jul 16, 2012

Thanks

I'd missed the update with the barrier instructions included..


John Ballance C.Eng MIET - jwb@rosery.net - 07976 295923


No no, I only put them there today in response to your issue - thought it more useful than just putting a reply here. It would be great if someone who understood this properly could tidy up the wiki page to say exactly what is needed though, everyone keeps saying "possibly needed" and using more than they need "just in case".

rosery commented Jul 16, 2012

@popcornmix Thanks.. mea culpa.. it was the L2 caching. Use 0x40000000 range addresses and the reply is as expected, when expected.

@NathanJPhillips this L2 address caching needs to be in the wiki.. will you do it, or shall I?

Thanks

John

Contributor

popcornmix commented Jul 16, 2012

We need a decision on the type of addresses to use.
swarren suggested using the ARM physical address (so without the 0x4 or 0xC alias) which I agree makes sense.
lp0 - do you agree?

rosery commented Jul 16, 2012

@popcornmix It would make most sense if the ARM side didn't need to know the VC side addressing mode, so I would strongly support using the ARM physical addresses (top 2 bits 0x0).

Also, please, when do you expect this system (however far it has got) to make it into the standard range of start.elfs on github?

thanks

John

rosery commented Jul 16, 2012

I'd add.. If the ARM side does need to know the VC side cache mode, could we please have a reliable way to determine what is required at run time?

Contributor

popcornmix commented Jul 16, 2012

I've been holding off pushing this out until the new official image has gone out (there's been quite a rewrite of the framebuffer code even when using the old mailbox channel).

However the new official image is uploaded as we speak, so next firmware I push out will include this.

rosery commented Jul 16, 2012

That is great news.

Incidentally does it correctly report how much ram is available to the ARM? .. the dropbox version 'gave it all' to the VC

Thanks

John

@rosery I've added a section on memory addresses to https://github.com/raspberrypi/firmware/wiki/Accessing-mailboxes. Is my understanding correct as far as you can tell? Also, are memory mapped registers, such as the address of the mailbox itself, always based at 0x20000000 irrespective of L2 caching?
@popcornmix Would the change to use ARM physical addresses just affect this interface or would it be applied to the framebuffer interface too? Would it affect the addresses returned for framebuffers in this interface and/or the framebuffer interface as well as the addresses passed to the mailbox for buffers? Also you mentioned the framebuffer interface is changing. I made a start at documenting it here: https://github.com/raspberrypi/firmware/wiki/Mailbox-framebuffer-interface, can you refer me to a description of the changes so I can update the wiki?

Also, if you're getting near pushing this then it would be good to deal with the two points from my overly-long message above that might affect the specification of the message. Firstly, does the buffer request/response code provide anything not already given by the combination of the individual tag request/response indicators, and if not should it be dropped? Secondly, what failures will you be reporting in the set palette response other than malformed requests (already reported with the request/response code), and if none should it be dropped?

rosery commented Jul 16, 2012

Hi Nathan. That reflects my (now) understanding of addressing used in
mailboxes. If @popcornmix makes the change to all addressing being arm
physical (0x0 for top 2 bits) this becomes irrelevant. Really, the arm
side should not need to know how the vc side is mapped!!

I'll feed back on what barriers etc I find needed in due course..


Contributor

popcornmix commented Jul 17, 2012

Would the change to use ARM physical addresses just affect this interface or would it be applied to the framebuffer interface too?

Just this interface. The framebuffer channel will eventually be deprecated.

Would it affect the addresses returned for framebuffers in this interface and/or the framebuffer interface as well as the addresses passed to the mailbox for buffers?

The suggestion is to use ARM physical addresses for the property channel (which eventually will be the only channel). It will affect all addresses through this channel, which is currently the property structure address passed through the mailbox, and the framebuffer base address.

Also you mentioned the framebuffer interface is changing. I made a start at documenting it here: https://github.com/raspberrypi/firmware/wiki/Mailbox-framebuffer-interface, can you refer me to a description of the changes so I can update the wiki?

Nope, the old interface is not changing. There is just a new interface that will eventually replace it.

Contributor

popcornmix commented Jul 17, 2012

Incidentally does it correctly report how much ram is available to the ARM? .. the dropbox version 'gave it all' to the VC

Looks like:
GET_ARM_MEMORY returns (0, 128M)
GET_VC_MEMORY returns (128M, 128M)
which seems right to me (this build is a 128M/128M split).

@popcornmix popcornmix added a commit that referenced this issue Jul 17, 2012

@popcornmix popcornmix Add sync_after_dma module parameter to kernel.
Tweak for hdmi mode selection when no detailed timing.
Add hdmi_ignore_hotplug, disable_l2cache_witealloc  and arm_control configs.
Add mailbox property channel, see github issue #47
43b688a

rosery commented Jul 17, 2012

@popcornmix agreed, with what you released today, the memory reports OK.. also the mac address and board serial numbers.

Should I be able to allocate a frame buffer yet through this message channel? At present I'm getting some rather unexpected responses.
Additionally, if I understand it correctly, this is what I put as a tag for requesting a buffer megabyte aligned
0x00040001
0x00000008
0x00000004 *****
0x00100000
0x00000000

the reply I get is:
0x00040001
0x80000008
0x00000004
0x00110000
0x00000000

***** as you only send it 1 word, the alignment, presumably this ought to be 4

I got funny replies on the resolution setting too .. it returned identical x and y resolutions. (a subsequent buffer set through channel 1 did leave me a working buffer)

Many thanks.. John

Can you post the entire contents of the buffer you are sending to the mailbox? Just want to check that the buffer headers are correct so that it can interpret the tag correctly in that context.

rosery commented Jul 18, 2012

Hi

dumbly.. this is the assembler used .. DCD defines a 4 byte word aligned word .. i.e u32 .. the tag values are correct.

OH.. and BTW this now works correctly using the ARM physical addressing. Comments in the wiki can now specify ARM physical addressing for all buffers passed to this channel (0x000000 to 0x1f000000)

tagb
DCD tagslen ; total length of this, including this word
DCD 0
tagmac
DCD ARM2VC_Tag_GetBoardMAC
DCD 8 ; tag buffer length
DCD 0 ; length of sent tag data in buffer
DCD 0
DCD 0
tagserial
DCD ARM2VC_Tag_GetBoardSerial
DCD 8
DCD 0
DCD 0
DCD 0
tagarmmem
DCD ARM2VC_Tag_GetARMMemory
DCD 8
DCD 0
DCD 0
DCD 0
tagvcmem
DCD ARM2VC_Tag_GetVCMemory
DCD 8
DCD 0
DCD 0
DCD 0
tagdisplphyswh
DCD ARM2VC_Tag_FBSetPhysDimension
DCD 8
DCD 8
DCD 1920
DCD 1080
tagdisplvirtwh
DCD ARM2VC_Tag_FBSetVirtDimension
DCD 8
DCD 8
DCD 1920
DCD 1080
tagdisplvirtoffset
DCD ARM2VC_Tag_FBSetVirtOffset
DCD 8
DCD 8
DCD 0
DCD 0
tagdispldepth
DCD ARM2VC_Tag_FBSetDepth
DCD 4
DCD 4
DCD 32 ; 32bit
tagdisplpixord
DCD ARM2VC_Tag_FBSetPixelOrder
DCD 4
DCD 4
DCD 1 ; RGB
tagdisplalpha
DCD ARM2VC_Tag_FBSetAlphaMode
DCD 4
DCD 4
DCD 2 ; alpha channel ignore
tagdisplalloc
DCD ARM2VC_Tag_FBAlloc
DCD 8
DCD 8
DCD 0x100000 ; megabyte aligned
DCD 0
DCD ARM2VC_Tag_End
tagslen * . - tagb
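
For readers working in C rather than assembler, the same buffer layout (total size, request code, then tags of id / value buffer size / request length / value words, then an end tag) can be built with a few helpers. This is only a sketch: the `prop_buf` structure and the `prop_*` function names are hypothetical, and the layout is inferred from the listing above.

```c
#include <stddef.h>
#include <stdint.h>

#define PROP_REQUEST 0x00000000u  /* buffer request code, as in the listing */
#define PROP_TAG_END 0x00000000u  /* end tag */

/* Hypothetical builder state; `words` is caller-provided storage. */
struct prop_buf {
    uint32_t *words;
    size_t    cap;   /* capacity in 32-bit words */
    size_t    used;  /* words written so far */
};

/* Start a buffer. Returns 0 on success, -1 if there is no room for the
 * two header words plus the end tag. */
static int prop_begin(struct prop_buf *pb, uint32_t *storage, size_t cap_words)
{
    if (cap_words < 3)
        return -1;
    pb->words = storage;
    pb->cap = cap_words;
    pb->used = 2;    /* reserve the total-size and request-code words */
    return 0;
}

/* Append one tag. `buf_bytes` is the value buffer size, `req_bytes` the
 * length of the request data supplied in `vals` (may be NULL if zero).
 * Returns 0 on success, -1 on bad lengths or insufficient storage. */
static int prop_add_tag(struct prop_buf *pb, uint32_t id, uint32_t buf_bytes,
                        uint32_t req_bytes, const uint32_t *vals)
{
    size_t val_words = buf_bytes / 4u + (buf_bytes % 4u != 0); /* pad to 32 bits */
    size_t req_words = req_bytes / 4u + (req_bytes % 4u != 0);
    size_t i;

    /* Keep one word spare for the end tag. */
    if (req_words > val_words || pb->used + 3 + val_words + 1 > pb->cap)
        return -1;
    pb->words[pb->used++] = id;
    pb->words[pb->used++] = buf_bytes;
    pb->words[pb->used++] = req_bytes;
    for (i = 0; i < val_words; i++)
        pb->words[pb->used++] = (i < req_words) ? vals[i] : 0u;
    return 0;
}

/* Write the end tag and the header; returns the total size in bytes. */
static uint32_t prop_end(struct prop_buf *pb)
{
    pb->words[pb->used++] = PROP_TAG_END;
    pb->words[0] = (uint32_t)(pb->used * 4u);
    pb->words[1] = PROP_REQUEST;
    return pb->words[0];
}
```

Feeding the eleven tags from the listing through these helpers, with the same sizes, gives 55 words in total, i.e. 0xDC bytes.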

Contributor

popcornmix commented Jul 18, 2012

I think your FBalloc length should be 4. Otherwise your tags seemed to be parsed correctly.
I fixed a bug where I had an xres/yres swap (pushed to github).
It seemed to set the mode okay. If you think the response is wrong, can you point out the words you don't like for the above message?

It looks like rosery was initially sending a length of 4 (from his previous post) when he got the response he quoted, which looks wrong as follows:
0x00040001 - correct
0x80000008 - should be 0x00000008
0x00000004 - should be 0x80000008
0x00110000 - perhaps correct if failed because of x-res/y-res swap and this was the previous buffer address
0x00000000 - correct if failed
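
The reading above assumes that bit 31 of a tag's third word marks it as a response and that the low 31 bits carry the value length. Under that assumption, checking a reply could look roughly like the sketch below; the function names are hypothetical and the buffer layout is inferred from the posts in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define TAG_RESPONSE_BIT 0x80000000u

/* Non-zero if the tag's third word is marked as a response. */
static int tag_is_response(uint32_t code)
{
    return (code & TAG_RESPONSE_BIT) != 0;
}

/* Value length in bytes, carried in the low 31 bits. */
static uint32_t tag_value_len(uint32_t code)
{
    return code & ~TAG_RESPONSE_BIT;
}

/* Walk a property buffer and return a pointer to the tag with the given id
 * (pointing at its id word), or NULL if it is absent or the chain runs
 * past the total size given in the header. */
static const uint32_t *find_tag(const uint32_t *buf, uint32_t id)
{
    size_t total = buf[0] / 4u;   /* total size in words, from the header */
    size_t i = 2;                 /* skip size and request/response code  */

    while (i + 3 <= total && buf[i] != 0u) {
        size_t val_words = buf[i + 1] / 4u + (buf[i + 1] % 4u != 0);
        if (i + 3 + val_words > total)
            return NULL;          /* tag extends beyond the buffer */
        if (buf[i] == id)
            return &buf[i];
        i += 3 + val_words;
    }
    return NULL;
}
```

A caller would look up the tag it sent, confirm `tag_is_response()` on the third word, and only then trust the value words that follow.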

rosery commented Jul 19, 2012

OK..
Latest build (18 July) has helped some
Hex dump of what is sent, and what is now received (the tag list is as previously posted):
*mem.fc005bc0 (sent)

Address : 3 2 1 0 7 6 5 4 B A 9 8 F E D C 3 2 1 0 7 6 5 4 B A 9 8 F E D C
FC005BC0 : 000000DC 00000000 00010003 00000008 00000000 00000000 00000000 00010004
FC005BE0 : 00000008 00000000 00000000 00000000 00010005 00000008 00000000 00000000
FC005C00 : 00000000 00010006 00000008 00000000 00000000 00000000 00048003 00000008
FC005C20 : 00000008 00000780 00000438 00048004 00000008 00000008 00000780 00000438
FC005C40 : 00048009 00000008 00000008 00000000 00000000 00048005 00000004 00000004
FC005C60 : 00000020 00048006 00000004 00000004 00000001 00048007 00000004 00000004
FC005C80 : 00000002 00040001 00000008 00000008 00100000 00000000 00000000
*mem.fc001050 (responded)

Address : 3 2 1 0 7 6 5 4 B A 9 8 F E D C 3 2 1 0 7 6 5 4 B A 9 8 F E D C
FC001050 : 000000DC 80000000 00010003 00000008 80000006 99EB27B8 0000BC53 00010004
FC001070 : 00000008 80000008 389953BC 00000000 00010005 00000008 80000008 00000000
FC001090 : 08000000 00010006 00000008 80000008 08000000 08000000 00048003 00000008
FC0010B0 : 80000008 00000780 00000780 00048004 00000008 80000008 00000780 00000780
FC0010D0 : 00048009 00000008 80000008 00000000 00000000 00048005 00000004 80000004
FC0010F0 : 00000020 00048006 00000004 80000004 00000000 00048007 00000004 80000004
FC001110 : 00000002 00040001 00000008 80000008 007F8000 00000000 00000000
*
from this:
tag 48003 requested 780_438 (hex) reply was 780_780, same for tag 48004

tag 40001 sent alignment of 0x100000 in first 4 bytes. this is all it sends so the count in the preceding word ought (as I read it) to be 4 (actually it seems that either 4 or 8 works, though surely 4 is the required value?)

tag 40001 received a byte count of 8 - correct - with first parameter 7f8000 - which is correct for the size of 780x438x20 (hex) - BUT this should be the frame buffer start address. Second parameter (word) received is 0, whereas it should receive the contents of the first parameter, 7f8000
(19 july edit) this 7f8000 value is what it has always returned in channel 1 .. it appears wrong .. see below

Nearly there, but fixing this will enable the use of the message channel to actually get frame buffers!!

Many thanks.. John

Whilst I note you don't request the pitch/stride, 780*4 = 1E00, which doesn't look like it needs rounding up. If the stride is 1E00 then 438 lines would be 7E9000, not 7F8000. I'd be more tempted to assume that the reply is following the standard for "If the requested alignment is unsupported..." Or have I made a mistake in my maths?

rosery commented Jul 19, 2012

I didn't request the pitch as at this stage I'm far more interested in gleaning the buffer address .. then I can use it. also in getting the correct response to the resolutions .. that way I would expect the test parameters to work.

As for the math, 7E9000 is mathematically correct .. result of (decimal) 1920 * 1080 * 4.

I checked what was returned in the mailbox channel 1 - frame buffer, and the same error exists. For this it reports a pitch of 0x1e00, as expected, and a frame buffer length of 0x7f8000

I'll put the pitch request in and see what happens .. ... ... it isn't yet implemented .. flagged as not responded to..

I put in 0x00040008 0x4 0x0 0x0. the response buffer has these unchanged .. i.e. no hi bit set in the 3rd word.

@popcornmix this math seems odd.. can you confirm what you expect?

Many thanks to you both
John

Contributor

popcornmix commented Jul 19, 2012

Fixed the framebuffer base/size being written to same word.
Also xres/yres swap in response message:
https://dl.dropbox.com/u/3669512/temp/start_mbox.elf
There is no set pitch message in the spec, so I won't respond to it.
Images generally have width/height rounded up to the next 16.
(although padding is not visible, the memory is allocated)
So I'd expect 1920x1080x32bpp to be 1920x1088x4=0x7f8000.
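
The arithmetic here can be checked with a short sketch. It assumes both dimensions are rounded up to a multiple of 16 and that pixels are tightly packed; the function names are hypothetical and this is not the firmware's actual allocation code.

```c
#include <stdint.h>

/* Round v up to the next multiple of a (a must be a power of two). */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1u) & ~(a - 1u);
}

/* Hypothetical estimate of the allocated framebuffer size in bytes,
 * assuming width and height are each rounded up to a multiple of 16. */
static uint32_t fb_alloc_bytes(uint32_t width, uint32_t height, uint32_t bpp)
{
    return align_up(width, 16u) * align_up(height, 16u) * (bpp / 8u);
}
```

For 1920x1080 at 32bpp this gives 1920 x 1088 x 4 = 0x7F8000, whereas the unpadded 1920 x 1080 x 4 is 0x7E9000, which accounts for the difference discussed above.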

Not sure why you fill in request length for allocate as 8. The spec says 4.

rosery commented Jul 19, 2012

Hi

On 19/07/2012 15:14, popcornmix wrote:

Fixed the framebuffer base/size being written to same word.

thanks

Also xres/yres swap in response message:
https://dl.dropbox.com/u/3669512/temp/start_mbox.elf

I take it this is a test image.. will these changes make it out on the
next release?

There is no set pitch message in the spec, so I won't respond to it.

no set_pitch or test_pitch.. agreed, but yes get_pitch .. it was
something you reported in the frame buffer channel ..
will you be doing so in the message channel .. if not, we'll need to
remove it from the spec.

Images generally have width/height rounded up to the next 16.
(although padding is not visible, the memory is allocated)
So I'd expect 1920x1080x32bpp to be 1920x1088x4=0x7f8000.

thanks for the explanation .. does that mean we'll have to specify
something (e.g. overshoot) to get to an accurate 1 pixel per display
pixel alignment?

Not sure why you fill in request length for allocate as 8. The spec says 4.

which is what I expected, though some confusion is around. Again, thanks
for confirming.

Thanks

John



rosery commented Jul 19, 2012

@popcornmix Thanks for the dropbox bit above. Yes it returns consistent values for all I've now tried except the get pitch call. Do you propose to implement that? .. needed I believe, to deal with some of the more awkward modes .. or at least permit clients to sort themselves out without specific knowledge of how you allocate in the vc side

Contributor

popcornmix commented Jul 20, 2012

I've updated dropbox file. It might handle get pitch now.

rosery commented Jul 20, 2012

That is brilliant.. pitch info reported correctly.

Do you have an ETA for this to hit github? I'm off to use this method in
mode changes within riscos.. it'd be rather nice to be able to point
people at the current start.elf releases ..

Many thanks John


Contributor

popcornmix commented Jul 20, 2012

It's in the main source tree, so will be in next firmware update. Probably this weekend.

Thanks for all your work on this. Really is appreciated.

If anyone could address my questions above I'd be happy to update the wiki to make it more informative/consistent.

rosery commented Jul 22, 2012

@popcornmix I have now implemented frame buffer fetch using the message system. Compared to requesting an identical buffer through the framebuffer channel, it looks as though the VC side is being set up with a different caching mode .. I presume that through channel 1 it sets up using 0x40000000 mode .. L2 non aligned(?). Through the messaging channel, is the VC side set up identically? It seems not. The effect seen here is that in a buffer set up through the messaging channel, with identical parameters, pixels don't always appear instantly.. there can be a visible delay to pixels or partial lines.. they seem to get there in the end.

Other than that, (and presumably alpha channel 0x1: Alpha channel reversed (0 = fully transparent) not yet implemented) the frame buffer and general information calls are looking good. I haven't done anything with clocks or power yet.

Thanks, John

Contributor

popcornmix commented Jul 22, 2012

It's been pushed to github. There's possibly a fix for cache coherency in framebuffer.

rosery commented Jul 22, 2012

Hi.. Regrettably not fixed. Would it help to send you a kernel (riscos) showing this? no extra disc load is required, just the kernel.img

rosery commented Jul 22, 2012

To clarify what doesn't behave: call via the messaging channel (using the tag sequence posted earlier) to get a frame buffer. As time progresses the screen gradually paints. A flashing cursor never shows and a moving mouse pointer just leaves odd specks.
Add a call to the frame buffer channel, using the same parameters, with 0x40000000 in request buffer high address and all is good.

Contributor

popcornmix commented Jul 22, 2012

Yes, send me a kernel image that shows the issue and I'll have a look.

swarren commented Jul 22, 2012

For the framebuffer (not) painting issue, I'd guess an ARM-side caching issue; it needs to be mapped uncached or flush operations performed. I assume that VC-side caching is less of an issue for a dumb FB since it'd be a HW block scanning out the FB content, which would never cache, and not the VC CPU which might cache

rosery commented Jul 22, 2012

@popcornmix

kernel with fb set up in message channel followed by fb set up using the fb channel. same parameters. same buffer returned
It should start OK without a config.txt, using kernel_old=1 otherwise

Use this first. Once booted, typing 'desktop' will put it to the desktop. may need to click cancel if offered.

look at the mouse pointer for example. solid movement

https://www.dropbox.com/s/x4nmi3v7kp945qm/KwithFBchan.zip

This second one uses only the message channel. One line of code is commented out.. the line calling the subroutine that sends a message to a channel (subroutine is used for both the message channel and the framebuffer channel).

https://www.dropbox.com/s/o2l34d0waevvqc3/KnoFBchan.zip

Let me know if you cannot see any blatantly obvious difference.

rosery commented Jul 22, 2012

@swarren . Thanks for comment. see above. I'm referring to different behaviour in a framebuffer returned by the fb channel to that returned by the messaging channel. Behaviour should be identical.

thanks

rosery commented Jul 22, 2012

@popcornmix BTW screen display assumes 1920 x 1080 x 32bpp at the moment. it ignores alpha .. but then I guess you'd spot that from your VC side diagnostics...

@rosery Are you accessing the buffer through the first mapping in the ARM MMU in both cases (i.e. setting the top bits to zero in the framebuffer channel case)?

rosery commented Jul 23, 2012

Hi Nathan

The frame buffer supplied by the VC side is accessed identically however
it is obtained. (physically.. hi nibble 0x0). That is the issue at stake.

I wonder if anyone else is yet using the messaging channel as the only
means to obtain a frame buffer, and if so, whether they see the same?


Contributor

popcornmix commented Jul 23, 2012

Sorry, the potential cache fix didn't make it into the official image. Your kernel does work with my local image, which I've uploaded to the earlier dropbox link.

rosery commented Jul 23, 2012

Ah that explains it. The version in the dropbox now does indeed behave as needed. Many thanks.. When do you think it'll reach github? ..mainly so I can easily point others at it

Many thanks

John

cycl0ne commented Aug 3, 2012

I tried this mailbox function yesterday.

My Results:
Response HDR: 80000000

(controlbits) (result1) x (result2)
Response: 0 32 x 0 // 32bit BPP
Response: 0 1600 x 1200 // Phys
Response: 0 1600 x 1200 //virt
Response: 0 0 x 0 // alpha
Response: 0 0 x 0 // offset
Response: 0 1 x 0 //pixelformat
Response: 0 8 x 0 // Alignment

I get 80000000 (successful), but no address?

Here is my code:
hdr->size = buffer_size;
hdr->type = VCBUF_PROCESS_REQUEST;
ftr->end_tag = 0x00000000;

tag1->id = VCTAG_SET_BPP;
tag1->value_buffer_size = sizeof(unsigned int);
tag1->ctrl.bits.value_length = 0;
tag1->ctrl.bits.type = VCTAG_REQUEST;
tag1->value_buffer = bpp;
tag1->value_buffer2 = 0;

tag2->id = VCTAG_SET_PHYS_WIDTHHEIGHT;
tag2->value_buffer_size = sizeof(unsigned int)*2;
tag2->ctrl.bits.value_length = 0;
tag2->ctrl.bits.type = VCTAG_REQUEST;
tag2->value_buffer = size_x;
tag2->value_buffer2 = size_y;

tag3->id = VCTAG_SET_VIRT_WIDTHHEIGHT;
tag3->value_buffer_size = sizeof(unsigned int)*2;
tag3->ctrl.bits.value_length = 0;
tag3->ctrl.bits.type = VCTAG_REQUEST;
tag3->value_buffer = size_x;
tag3->value_buffer2 = size_y;

tag4->id = VCTAG_SET_ALPHA;
tag4->value_buffer_size = sizeof(unsigned int)*2;
tag4->ctrl.bits.value_length = 0;
tag4->ctrl.bits.type = VCTAG_REQUEST;
tag4->value_buffer = 0;

tag5->id = VCTAG_SET_VIRT_OFFSET;
tag5->value_buffer_size = sizeof(unsigned int)*2;
tag5->ctrl.bits.value_length = 0;
tag5->ctrl.bits.type = VCTAG_REQUEST;
tag5->value_buffer = 0;
tag5->value_buffer2 = 0;

tag6->id = VCTAG_SET_PIXELORDER;
tag6->value_buffer_size = sizeof(unsigned int)*2;
tag6->ctrl.bits.value_length = 0;
tag6->ctrl.bits.type = VCTAG_REQUEST;
tag6->value_buffer = 1;

tag7->id = VCTAG_ALLOCATE_BUFFER;
tag7->value_buffer_size = sizeof(unsigned int)*2;
tag7->ctrl.bits.value_length = 0;
tag7->ctrl.bits.type = VCTAG_REQUEST;
tag7->value_buffer = 8;
tag7->value_buffer2 = 0;

mbox_write_gen(MAILBOX_CHANNEL_ARM_TO_VC, arm_to_vc((void *)vc_mailbox_buf));
mbox_read_gen(MAILBOX_CHANNEL_ARM_TO_VC);

Am I missing something? Another thing: what are the options for the pixel type? They are missing from the wiki.

cycl0ne commented Aug 4, 2012

my fault. i found my bug. allocated the memory wrong. sry

rosery referenced this issue Sep 15, 2012

Closed

VSync IRQ? #67

Contributor

popcornmix commented Dec 27, 2012

This should all be implemented now.

popcornmix closed this Dec 27, 2012

lurch referenced this issue in raspberrypi/linux Aug 1, 2013

Closed

Wrong (and missing) tag values in vcio.h #345

@neuschaefer neuschaefer pushed a commit to neuschaefer/raspi-binary-firmware that referenced this issue Feb 27, 2017

@popcornmix popcornmix Add sync_after_dma module parameter to kernel.
Tweak for hdmi mode selection when no detailed timing.
Add hdmi_ignore_hotplug, disable_l2cache_witealloc  and arm_control configs.
Add mailbox property channel, see github issue #47
facc30d